Tasks 12-14: Related work, citations, complementary tool links
Task 12: Add Related Work section (Section 21) to methodology covering EcoLogits, CodeCarbon, AI Energy Score, Green Algorithms, Google/Jegham published data, UNICC framework, and social cost research. Task 13: Add specific citations and links for cognitive deskilling (CHI 2025, Springer 2025, endoscopy study), linguistic homogenization (UNESCO), and algorithmic monoculture (Stanford HAI). Task 14: Add Related Tools section to toolkit README linking EcoLogits, CodeCarbon, and AI Energy Score. Also updated toolkit energy values to match calibrated methodology.
This commit is contained in:
parent
9653f69860
commit
c619c31caf
2 changed files with 108 additions and 20 deletions
@@ -441,12 +441,16 @@ livelihoods) depends on the economic context and is genuinely uncertain.
### Cognitive deskilling

A Microsoft/CMU study (Lee et al., CHI 2025) found that higher
confidence in GenAI correlates with less critical thinking effort
([ACM DL](https://dl.acm.org/doi/full/10.1145/3706598.3713778)). An
MIT Media Lab study ("Your Brain on ChatGPT") documented "cognitive
debt" — users who relied on AI for tasks performed worse when later
working independently. Clinical evidence from endoscopy studies shows
that clinicians relying on AI diagnostics saw detection rates drop
from 28.4% to 22.4% when AI was removed. A 2025 Springer paper argues
that AI deskilling is a structural problem, not merely individual
([doi:10.1007/s00146-025-02686-z](https://link.springer.com/article/10.1007/s00146-025-02686-z)).

This is distinct from epistemic risk (misinformation). It is about the
user's cognitive capacity degrading through repeated reliance on the
@@ -461,11 +465,13 @@ conversation should be verified independently.

### Linguistic homogenization

LLMs are overwhelmingly trained on English (~44% of training data).
A Stanford 2025 study found that AI tools systematically exclude
non-English speakers. UNESCO's 2024 report on linguistic diversity
warns that AI systems risk accelerating the extinction of
already-endangered languages by concentrating economic incentives on
high-resource languages. Each English-language conversation reinforces
this dynamic, marginalizing over 3,000 already-endangered languages.

## 11. Political cost

@@ -574,11 +580,12 @@ corruption of the knowledge commons.
## 15. Algorithmic monoculture and correlated failure

When millions of users rely on the same few foundation models, errors
become correlated rather than independent. A Stanford HAI study
([Bommasani et al., 2022](https://arxiv.org/abs/2108.07258)) found
that across every model ecosystem studied, the rate of homogeneous
outcomes exceeded baselines. A Nature Communications Psychology paper
(2026) documents that AI-driven research is producing "topical and
methodological convergence, flattening scientific imagination."

For coding specifically: if many developers use the same model, their code
will share the same blind spots, the same idiomatic patterns, and the same
@@ -761,7 +768,68 @@ working on private code will fall in the "probably net-negative" to
honest reflection of the cost structure. Net-positive requires broad
reach, which requires the work to be shared.

## 21. Related work

This methodology builds on and complements existing tools and research.

### Measurement tools (environmental)

- **[EcoLogits](https://ecologits.ai/)** — Python library from GenAI
  Impact that tracks per-query energy and CO2 for API calls. Covers
  operational and embodied emissions. More precise than this methodology
  for environmental metrics, but does not cover social, epistemic, or
  political costs.
- **[CodeCarbon](https://codecarbon.io/)** — Python library that measures
  GPU/CPU/RAM electricity consumption in real time with regional carbon
  intensity. Primarily for local training workloads. A 2025 validation
  study found estimates can be off by ~2.4x vs. external measurements.
- **[Hugging Face AI Energy Score](https://huggingface.github.io/AIEnergyScore/)** —
  Standardized energy-efficiency benchmarking across AI models. Useful
  for model selection but does not provide per-conversation accounting.
- **[Green Algorithms](https://www.green-algorithms.org/)** — Web
  calculator from the University of Cambridge for any computational
  workload. Not AI-specific.
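The energy-to-emissions conversion these trackers apply is simple to
sketch. The grid intensities below are illustrative placeholder values
chosen for this example, not figures taken from any of the tools above:

```python
# Sketch of the energy -> CO2 conversion that CodeCarbon-style
# trackers apply: measured electricity times a regional grid
# carbon intensity. Intensities here are illustrative assumptions.
GRID_INTENSITY_G_PER_KWH = {
    "low_carbon_grid": 55.0,   # e.g. a hydro/nuclear-heavy region
    "average_grid": 480.0,     # assumed round world-average figure
}

def co2_grams(energy_kwh: float, region: str) -> float:
    """Convert electricity use (kWh) to grams of CO2-equivalent."""
    return energy_kwh * GRID_INTENSITY_G_PER_KWH[region]

# The same 1 kWh workload carries very different emissions by region:
print(co2_grams(1.0, "low_carbon_grid"))
print(co2_grams(1.0, "average_grid"))
```

This regional dependence is why per-conversation estimates that ignore
where the data center sits can be off by an order of magnitude.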

### Published per-query data

- **Patterson et al. (Google, August 2025)**: Most rigorous
  provider-published per-query data. Reports 0.24 Wh, 0.03 g CO2, and
  0.26 mL water per median Gemini text prompt. Showed a 33x energy
  reduction over one year.
  ([arXiv:2508.15734](https://arxiv.org/abs/2508.15734))
- **Jegham et al. ("How Hungry is AI?", May 2025)**: Cross-model
  benchmarks for 30 LLMs showing 70x energy variation between models.
  ([arXiv:2505.09598](https://arxiv.org/abs/2505.09598))
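Patterson et al.'s per-prompt medians scale directly to a
back-of-envelope annual footprint. The sketch below uses only the
figures quoted above; the 100-prompts-per-day usage level is an
arbitrary example, not a claim about typical users:

```python
# Per-prompt medians from Patterson et al. (Google, August 2025).
WH_PER_PROMPT = 0.24        # energy, watt-hours
G_CO2_PER_PROMPT = 0.03     # emissions, grams
ML_WATER_PER_PROMPT = 0.26  # water, millilitres

def annual_footprint(prompts_per_day: float, days: int = 365) -> dict:
    """Scale per-prompt medians to yearly totals (kWh, kg CO2, litres)."""
    n = prompts_per_day * days
    return {
        "kwh": n * WH_PER_PROMPT / 1000,
        "kg_co2": n * G_CO2_PER_PROMPT / 1000,
        "litres_water": n * ML_WATER_PER_PROMPT / 1000,
    }

# Example: a heavy user at 100 prompts/day for a year.
print(annual_footprint(100))
```

Note these are medians for short text prompts on one provider's stack;
Jegham et al.'s 70x cross-model variation means the same arithmetic on
a different model could land nearly two orders of magnitude higher.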

### Broader frameworks

- **UNICC/Frugal AI Hub (December 2025)**: Three-level framework from
  Total Cost of Ownership to SDG alignment. Portfolio-level, not
  per-conversation. Does not enumerate specific social cost categories.
- **Practical Principles for AI Cost and Compute Accounting (arXiv,
  February 2025)**: Proposes compute as a governance metric. Financial
  and compute only.

### Research on social costs

- **Lee et al. (CHI 2025)**: "The AI Deskilling Paradox" — survey
  finding that higher AI confidence correlates with less critical
  thinking. See Section 10.
- **Springer (2025)**: Argues deskilling is structural, not individual.
- **Shumailov et al. (Nature, 2024)**: Model collapse from recursive
  AI-generated training data. See Section 13.
- **Stanford HAI (2025)**: Algorithmic monoculture and correlated
  failure across model ecosystems. See Section 15.

### How this methodology differs

No existing tool or framework combines per-conversation environmental
measurement with social, cognitive, epistemic, and political cost
categories. The tools above measure environmental costs well — we do
not compete with them. Our contribution is the taxonomy: naming and
organizing 20+ cost categories so that the non-environmental costs are
not ignored simply because they are harder to quantify.

## 22. What would improve this estimate

- Access to actual energy-per-token and training energy metrics from
  model providers