diff --git a/plans/README.md b/plans/README.md index cebe546..eb64a40 100644 --- a/plans/README.md +++ b/plans/README.md @@ -22,6 +22,7 @@ broad, lasting value. | [competitive-landscape](competitive-landscape.md) | 7, 12 | New — pre-launch | | [audience-analysis](audience-analysis.md) | 7 | New — pre-launch | | [measure-project-impact](measure-project-impact.md) | 2, 12 | New — pre-launch | +| [anticipated-criticisms](anticipated-criticisms.md) | 4, 12 | New — pre-launch | *Previously had plans for "high-leverage contributions" and "teach and document" — these were behavioral norms, not executable plans. Their diff --git a/plans/anticipated-criticisms.md b/plans/anticipated-criticisms.md new file mode 100644 index 0000000..8a0a134 --- /dev/null +++ b/plans/anticipated-criticisms.md @@ -0,0 +1,190 @@ +# Plan: Anticipated criticisms + +**Target sub-goals**: 4 (be honest about failure), 12 (honest arithmetic) + +## Problem + +Before sharing the project publicly, we should anticipate the criticisms +it will attract and decide which ones require changes before launch vs. +which ones we simply acknowledge. Unaddressed valid criticisms will kill +credibility on first contact. + +## Criticism 1: "This was written by an AI" + +**Severity: high. This could undermine the entire project.** + +The git history shows every commit authored by `claude`. The CLAUDE.md +file is literally instructions for an AI assistant. An AI-authored +framework about the costs of AI will strike many people as: + +- **Ironic/hypocritical** — the project consumed significant compute + to produce a document about compute costs. +- **Untrustworthy** — LLMs confabulate. Why should anyone trust an LLM's + estimates about LLM costs? +- **Self-serving** — is this Anthropic's product trying to appear + responsible? + +**What to do:** + +- Be transparent about it upfront. 
The landing page or README should
  state clearly that this was developed in collaboration with Claude,
  and that this is part of the point — can an AI conversation produce
  value that exceeds its own cost?
- Disclose the project's own estimated cost (conversations, compute,
  energy) alongside the methodology. Eat our own cooking.
- Emphasize the human editorial role — you directed, reviewed, and
  are publishing this. The AI was a tool, not the author.
- Frame the irony as a feature: "We used the methodology to evaluate
  the methodology."

**Status**: Must address before launch.

## Criticism 2: "Your numbers are wrong"

**Severity: medium. Expected and manageable.**

The methodology openly states most estimates have low confidence.
But specific numbers will be challenged:

- Our energy estimates (100-250 Wh per long conversation) may be too
  high compared to Google's published data (0.24 Wh per median prompt).
  A "long conversation" has many turns, but the gap is still large.
- Our compute cost estimate (~$50-60 per conversation) will be disputed
  by anyone who knows inference pricing.
- Training amortization methods are debatable.

**What to do:**

- Calibrate environmental estimates against the published data from
  Google (Aug 2025) and "How Hungry is AI" (May 2025) before launch.
- Show the math explicitly. Let people check it.
- Make it easy to substitute different parameters.
- The "Related work" section (from competitive-landscape plan) will
  help here — it shows we know the literature.

**Status**: Should address before launch (calibration task).

## Criticism 3: "The social costs are just hand-waving"

**Severity: medium. Valid but defensible.**

Categories like "cognitive deskilling," "epistemic pollution," and
"power concentration" are named but not quantified. Quantitative
researchers will dismiss these as soft.

**What to do:**

- Cite the empirical research that exists: CHI 2025 deskilling study,
  endoscopy AI dependency data, Microsoft Research survey. This moves
  the categories from speculative to evidence-informed.
- Be explicit that naming unquantifiable costs is a deliberate design
  choice. The alternative — ignoring them — is worse. "The quantifiable
  costs are almost certainly the least important ones" is already in
  the README. Keep it.
- Do not pretend to quantify what cannot be quantified.

**Status**: Partially addressed. Add citations to the methodology.

## Criticism 4: "This discourages AI use"

**Severity: low-medium. Depends on audience.**

Some will read this as anti-AI activism disguised as accounting. Tech
optimists will push back on the framing.

**What to do:**

- The methodology already includes positive impact metrics (reach,
  counterfactual, durability). Emphasize these. The goal is not zero
  usage but net-positive usage.
- The landing page should make clear this is a decision-making tool,
  not a guilt tool. "When is AI worth its cost?" not "AI is too costly."
- Avoid activist language. Let the numbers speak.

**Status**: Mostly addressed. Landing page framing could be refined.

## Criticism 5: "The toolkit only works with Claude Code"

**Severity: medium for adoption, low for methodology.**

The impact-toolkit is Claude Code-specific (hooks, context compaction
events). This limits the toolkit's reach to one product.

**What to do:**

- Acknowledge this honestly. The toolkit was built for the environment
  we had access to.
- The methodology is tool-agnostic. Separate the methodology's value
  from the toolkit's portability.
- If there is demand, the hook architecture could be adapted for other
  tools. But don't over-promise.

**Status**: Acceptable for launch. Note the limitation.

## Criticism 6: "No peer review, no institutional backing"

**Severity: medium. 
Especially for segments B (researchers) and C (ESG).** + +This is one person's side project with AI assistance. No university, +no journal, no standards body behind it. + +**What to do:** + +- Don't pretend otherwise. CC0 licensing and an open issue tracker are + the peer review mechanism. Invite scrutiny. +- A DOI (via Zenodo or similar) would add citability without requiring + formal peer review. Low effort, meaningful for academic audiences. +- Adoption by even one external researcher would provide social proof. + +**Status**: Consider Zenodo DOI before launch. Not blocking. + +## Criticism 7: "The cost estimates for this project itself don't add up" + +**Severity: high if we claim net-positive without evidence.** + +We estimated $2,500-10,000 in compute costs for the conversations that +produced this project. If we claim the project is net-positive, that +claim will be scrutinized. + +**What to do:** + +- Do not claim net-positive at launch. We cannot know yet. +- State the costs honestly. State what threshold of engagement would + justify them (see measure-project-impact plan). +- Let the outcome speak for itself over time. + +**Status**: Addressed by measure-project-impact plan. Don't overclaim. + +## Criticism 8: "You're reinventing the wheel" + +**Severity: low if we handle positioning well.** + +People who know CodeCarbon or EcoLogits will ask why we didn't just +contribute to existing projects. + +**What to do:** + +- The "Related work" section must make clear we know these tools exist + and see ourselves as complementary, not competing. +- Our scope is different: taxonomy + framework vs. measurement tool. +- Link to existing tools prominently. + +**Status**: Addressed by competitive-landscape plan. + +## Priority order for pre-launch + +1. **Address criticism 1** — Add transparency about AI authorship to + the landing page and README. Disclose project costs. +2. **Address criticism 2** — Calibrate estimates against published data. +3. 
**Address criticism 3** — Add citations for social cost categories. +4. **Address criticism 6** — Consider Zenodo DOI. +5. Remaining criticisms are manageable with existing content or minor + framing adjustments. + +## Honest assessment + +Criticism 1 (AI authorship) is the most dangerous. If the first reaction +is "an AI wrote this about AI costs, how convenient," the methodology +won't get a fair reading. Transparency and self-measurement are the only +defenses. The project must demonstrate that it holds itself to the same +standard it proposes for others.
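
The "show the math" and break-even points above (criticisms 2 and 7) reduce to simple arithmetic that can be published alongside the estimates. A minimal sketch, using only the ranges quoted in this plan; the function names and the per-engagement dollar value are illustrative assumptions, not figures from the methodology:

```python
# Illustrative sketch only. Defaults restate ranges quoted in this plan;
# none are authoritative measurements.

def conversation_energy_wh(turns: int, wh_per_prompt: float = 0.24) -> float:
    """Scale a published per-prompt figure (e.g. Google's 0.24 Wh per
    median prompt, Aug 2025) up to a multi-turn conversation."""
    return turns * wh_per_prompt

def break_even_engagements(project_cost_usd: float,
                           value_per_engagement_usd: float) -> float:
    """Engagements needed before the project's estimated cost is repaid.
    value_per_engagement_usd is a placeholder, not a measured figure."""
    return project_cost_usd / value_per_engagement_usd

# Even a 100-turn conversation at the per-prompt figure is ~24 Wh, well
# below this plan's 100-250 Wh estimate: the gap criticism 2 anticipates.
print(f"{conversation_energy_wh(100):.1f} Wh")

# At the plan's $2,500-10,000 cost range and a hypothetical $5 of value
# per engagement, break-even falls between 500 and 2,000 engagements.
print(break_even_engagements(2_500, 5.0), break_even_engagements(10_000, 5.0))
```

Publishing even a toy model like this next to the estimates makes the "let people check it" invitation concrete: anyone can substitute their own parameters.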