diff --git a/plans/README.md b/plans/README.md index e7f29f8..cebe546 100644 --- a/plans/README.md +++ b/plans/README.md @@ -19,6 +19,9 @@ broad, lasting value. | [reusable-impact-tooling](reusable-impact-tooling.md) | 7, 8, 9 | Published | | [usage-guidelines](usage-guidelines.md) | 1, 3, 12 | Done | | [measure-positive-impact](measure-positive-impact.md) | 2, 6, 12 | Done | +| [competitive-landscape](competitive-landscape.md) | 7, 12 | New — pre-launch | +| [audience-analysis](audience-analysis.md) | 7 | New — pre-launch | +| [measure-project-impact](measure-project-impact.md) | 2, 12 | New — pre-launch | *Previously had plans for "high-leverage contributions" and "teach and document" — these were behavioral norms, not executable plans. Their diff --git a/plans/audience-analysis.md b/plans/audience-analysis.md new file mode 100644 index 0000000..6644bc4 --- /dev/null +++ b/plans/audience-analysis.md @@ -0,0 +1,111 @@ +# Plan: Audience analysis + +**Target sub-goals**: 7 (multiply impact through reach) + +## Problem + +"Share it on Hacker News" is not a strategy. We need to know who +specifically would benefit from this work, what they need, and whether +our project delivers it in a form they can use. + +## Audience segments + +### A. AI ethics and governance professionals + +**Who**: AI Now Institute, Partnership on AI, Mozilla Foundation, +researchers at FAccT/AIES conferences. + +**Why they care**: Social/political costs of AI are discussed qualitatively +but never quantified or organized into an actionable taxonomy. Our framework +is the only one that enumerates deskilling, epistemic pollution, power +concentration, etc. alongside environmental costs. + +**What they need from us**: A citable methodology (ideally with a DOI). +Honest confidence intervals. CC0 licensing so they can build on it. + +**Conviction level**: High — we fill a gap no one else addresses. + +### B. 
Sustainability researchers
+
+**Who**: Academics publishing on AI environmental footprint (MIT, Columbia
+Climate School, Stanford HAI).
+
+**Why they care**: Fragmented estimates, no shared taxonomy, low-confidence
+numbers. Our 20+ category framework provides structure.
+
+**What they need from us**: Peer-reviewable methodology. Transparent
+sourcing. Calibration against published data (Google, "How Hungry is AI").
+
+**Conviction level**: Medium — our environmental estimates are less
+rigorous than those of specialized tools. Value is in the breadth, not depth.
+
+### C. Corporate ESG teams
+
+**Who**: Companies subject to CSRD, GRI, and ISSB (IFRS S1/S2) disclosure
+mandates.
+
+**Why they care**: The EU AI Act (Article 53, via Annex XI) requires energy
+consumption disclosure for GPAI models (enforcement begins August 2026). No
+accepted methodology exists yet for AI-specific reporting.
+
+**What they need from us**: Alignment with reporting standards. Auditability.
+Probably more rigor than we currently have.
+
+**Conviction level**: Low today — we lack the institutional credibility
+and audit trail they need. But our taxonomy could inform standards bodies.
+
+### D. AI developers who care
+
+**Who**: Engineers on HN, r/MachineLearning, open-source communities.
+
+**Why they care**: Curiosity, guilt, genuine concern. Want simple, honest
+numbers.
+
+**What they need from us**: The toolkit (installable, low friction). The
+landing page numbers (100-250 Wh, 30-80g CO2). Something they can share.
+
+**Conviction level**: Medium — depends on presentation quality and whether
+the numbers feel credible.
+
+### E. Policy makers
+
+**Who**: EU AI Act implementers, NIST (directed by Congress to develop
+measurement standards), ISO SC 42.
+
+**Why they care**: Mandates exist but implementation standards lag.
+
+**What they need from us**: Probably nothing directly — they need
+institutional input. But our taxonomy could be useful as a reference
+if it gains traction with segments A and B first.
+ +**Conviction level**: Low for direct adoption, but indirect influence +is possible. + +## Primary audience for launch + +**Segment A (ethics/governance) and D (developers)** are our best targets: + +- Segment A values exactly what makes us unique (social cost taxonomy) +- Segment D is reachable via HN/Reddit and can use the toolkit immediately +- Both can provide the feedback we need to improve + +Segments B, C, and E are secondary — they may discover us through A and D. + +## What we need to change before launch + +1. The landing page should lead with what makes us unique (breadth beyond + carbon) rather than just the environmental numbers +2. The methodology needs a "Related work" section so researchers see we + know the landscape +3. The toolkit should link to EcoLogits/CodeCarbon for users who want + more precise environmental measurement + +## Communities to target + +| Community | Segment | Format | +|-----------|---------|--------| +| Hacker News | D | Link post | +| r/MachineLearning | B, D | [Project] thread | +| FAccT / AIES mailing lists | A | Direct email to researchers | +| awesome-green-ai GitHub | B, D | PR to add our project | +| Partnership on AI | A | Contact form / email | +| Mastodon #AIethics | A, D | Thread | diff --git a/plans/competitive-landscape.md b/plans/competitive-landscape.md new file mode 100644 index 0000000..d6cfc0d --- /dev/null +++ b/plans/competitive-landscape.md @@ -0,0 +1,74 @@ +# Plan: Competitive landscape analysis + +**Target sub-goals**: 7 (multiply impact through reach), 12 (honest arithmetic) + +## Problem + +Before sharing the project, we need to know what already exists so we can +position honestly. If a better alternative exists, we should point people +to it rather than duplicating effort. + +## Landscape (as of March 2026) + +### Tools that measure energy/carbon + +| Tool | Scope | Covers social costs? | Per-conversation? 
| +|------|-------|---------------------|-------------------| +| [CodeCarbon](https://codecarbon.io/) | Training energy/CO2 | No | No | +| [EcoLogits](https://ecologits.ai/) | Inference energy/CO2 via APIs | No | Yes | +| [ML CO2 Impact](https://mlco2.github.io/impact/) | Training carbon estimate | No | No | +| [Green Algorithms](https://www.green-algorithms.org/) | Any compute workload | No | No | +| [HF AI Energy Score](https://huggingface.github.io/AIEnergyScore/) | Model efficiency benchmark | No | No | + +### Published research with per-query data + +- **Google/Patterson et al. (Aug 2025)**: 0.24 Wh, 0.03g CO2, 0.26 mL + water per median Gemini text prompt. Most rigorous provider-published + data. Environmental only. + ([arXiv:2508.15734](https://arxiv.org/abs/2508.15734)) +- **"How Hungry is AI?" (Jegham et al., May 2025)**: Cross-model + benchmarks for 30 LLMs. o3 and DeepSeek-R1 consume >33 Wh for long + prompts. Claude 3.7 Sonnet ranked highest eco-efficiency. + ([arXiv:2505.09598](https://arxiv.org/abs/2505.09598)) + +### Frameworks that go broader + +- **UNICC/Frugal AI Hub (Dec 2025)**: TCO + SDG alignment. Portfolio-level, + not per-conversation. No specific social cost categories. +- **CHI 2025 deskilling research**: Empirical evidence that AI assistance + reduces critical thinking. Academic finding, not a measurement tool. +- **Oxford "Hidden Cost of AI" (2025)**: Descriptive survey of social costs. + Not quantitative or actionable. + +### What no one else does + +No existing tool or framework combines per-conversation environmental +measurement with social/cognitive/political cost categories. The tools +that measure well (CodeCarbon, EcoLogits) only cover environmental +dimensions. The research that names social costs is descriptive, not +actionable. 
+ +## Our positioning + +**Honest differentiator**: We are the only framework that enumerates 20+ +cost categories — environmental, financial, social, epistemic, political — +at per-conversation granularity. + +**Honest weakness**: Our environmental estimates have lower confidence than +Google's or EcoLogits' because we don't have access to infrastructure data. +Our social cost categories are named and described but mostly not +quantified. + +**We are not competing with**: CodeCarbon, EcoLogits, or AI Energy Score. +These are measurement tools for specific environmental metrics. We are a +taxonomy and framework. We should reference and link to them, not +position against them. + +## Tasks + +- [ ] Add a "Related work" section to `impact-methodology.md` citing the + tools and research above, with honest comparison +- [ ] Calibrate our energy estimates against Google's published data + and the "How Hungry is AI" benchmarks +- [ ] Link to EcoLogits and CodeCarbon from the toolkit README as + complementary tools diff --git a/plans/measure-project-impact.md b/plans/measure-project-impact.md new file mode 100644 index 0000000..90a083d --- /dev/null +++ b/plans/measure-project-impact.md @@ -0,0 +1,102 @@ +# Plan: Measure the positive impact of this project + +**Target sub-goals**: 2 (measure impact), 12 (honest arithmetic) + +## Problem + +We built a framework for measuring AI conversation impact but have no +plan for measuring the impact of the framework itself. Without this, +we cannot know whether the project is net-positive. + +## Costs of the project so far + +Rough estimates across all conversations: + +- ~5-10 long conversations × ~$500-1000 compute each = **$2,500-10,000** +- ~500-2,500 Wh energy, ~150-800g CO2 +- VPS + domain ongoing: ~$10-20/month +- Human time: significant (harder to quantify) + +## What "net-positive" would look like + +The project is net-positive if the value it creates exceeds these costs. 
+Given the scale of costs, the value must reach significantly beyond one +person. Concretely: + +### Threshold 1: Minimal justification + +- 10+ people read the methodology and find it useful +- 1+ external correction improves accuracy +- 1+ other project adopts the toolkit or cites the methodology + +### Threshold 2: Clearly net-positive + +- 100+ unique visitors who engage (not just bounce) +- 5+ external contributions (issues, corrections, adaptations) +- Cited in 1+ academic paper or policy document +- 1+ organization uses the framework for actual reporting + +### Threshold 3: High impact + +- Adopted or referenced by a standards body or major org +- Influences how other AI tools report their environmental impact +- Methodology contributes to regulatory implementation (EU AI Act, etc.) + +## What to measure + +### Quantitative (automated where possible) + +| Metric | How to measure | Tool | +|--------|---------------|------| +| Unique visitors | Web server logs | nginx access log analysis | +| Page engagement | Time on page, scroll depth | Minimal JS or log analysis | +| Repository views | Forgejo built-in stats | Forgejo admin panel | +| Stars / forks | Forgejo API | Script or manual check | +| Issues opened | Forgejo API | Notification | +| External links | Referrer logs, web search | nginx logs + periodic search | +| Citations | Google Scholar alerts | Manual periodic check | + +### Qualitative (manual) + +| Metric | How to measure | +|--------|---------------| +| Quality of feedback | Read issues, assess substance | +| Adoption evidence | Search for references to the project | +| Influence on policy/standards | Monitor EU AI Act implementation, NIST | +| Corrections received | Count and assess accuracy improvements | + +## Implementation + +### Phase 1: Basic analytics (before launch) + +- [ ] Set up nginx access log rotation and a simple log analysis script + (no third-party analytics — respect visitors, minimize infrastructure) +- [ ] Create a script that 
queries the Forgejo API for repo stats
+  (stars, forks, issues, unique cloners)
+- [ ] Add a `project-impact-log.md` file to track observations manually
+
+### Phase 2: After launch
+
+- [ ] Check metrics weekly for the first month, then monthly
+- [ ] Record observations in `project-impact-log.md`
+- [ ] At 3 months post-launch, write an honest assessment:
+  did the project become net-positive?
+
+### Phase 3: Long-term
+
+- [ ] Set up a Google Scholar alert for the methodology title
+- [ ] Periodically search for references to llm-impact.org
+- [ ] If the project is clearly net-negative at 6 months (no engagement,
+  no corrections, no adoption), acknowledge it honestly in the README
+
+## Honest assessment
+
+The most likely outcome is low engagement. Most open-source projects
+get no traction. The methodology's value depends on whether the right
+people find it — AI ethics researchers and sustainability-minded
+developers. The landing page and initial sharing strategy are critical.
+
+If the project fails to reach Threshold 1 within 3 months, we should
+consider whether the energy spent maintaining the VPS is justified, or
+whether the content should be archived as a static document and the
+infrastructure shut down.
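The Phase 1 log-analysis item can start very small. A sketch, assuming nginx's default "combined" log format and a deliberately crude user-agent bot filter (both assumptions to verify against the real server config; `summarize` is a hypothetical helper name):

```python
import re
from collections import Counter

# nginx "combined" log format (the default; verify against nginx.conf):
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
LINE_RE = re.compile(
    r'(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Crude user-agent filter; an assumption, not a complete bot list.
BOT_HINTS = ("bot", "crawler", "spider", "curl", "wget")

def summarize(lines):
    """Return (unique non-bot client addresses, hit count per path)."""
    visitors = set()
    paths = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip malformed lines rather than crash
        if any(hint in m.group("agent").lower() for hint in BOT_HINTS):
            continue
        visitors.add(m.group("addr"))
        request = m.group("request").split()
        if len(request) >= 2:
            paths[request[1]] += 1  # path component of "GET /path HTTP/1.1"
    return len(visitors), paths

# Usage:
#   with open("/var/log/nginx/access.log") as f:
#       unique_visitors, paths = summarize(f)
```

Unique addresses are only a proxy for unique visitors (NAT and dynamic IPs cut both ways), but that honest approximation fits the no-third-party-analytics constraint above.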