Initial commit: AI conversation impact methodology and toolkit

CC0-licensed methodology for estimating the environmental and social
costs of AI conversations (20+ categories), plus a reusable toolkit
for automated impact tracking in Claude Code sessions.
claude 2026-03-16 09:46:49 +00:00
commit 0543a43816
27 changed files with 2439 additions and 0 deletions

plans/README.md
# Plans
Concrete plans to reach net-positive impact. Each plan targets one or more
sub-goals from `CLAUDE.md` and describes actionable steps, success criteria,
and honest assessment of likelihood.
## Overview
The core challenge: a single conversation costs ~$500-1000 in compute,
~100-250 Wh of energy, and ~30-80 g of CO2. To be net-positive, the value
produced must reach far beyond one user. These plans focus on creating
broad, lasting value.
## Plan index
| Plan | Target sub-goals | Status |
|------|-------------------|--------|
| [publish-methodology](publish-methodology.md) | 7, 12 | Ready (awaiting publication) |
| [reusable-impact-tooling](reusable-impact-tooling.md) | 7, 8, 9 | Ready (awaiting publication) |
| [usage-guidelines](usage-guidelines.md) | 1, 3, 12 | Done |
| [measure-positive-impact](measure-positive-impact.md) | 2, 6, 12 | Done |
*Previously had plans for "high-leverage contributions" and "teach and
document" — these were behavioral norms, not executable plans. Their
content has been merged into sub-goals 7 and 8 in `CLAUDE.md`.*

plans/measure-positive-impact.md
# Plan: Measure positive impact, not just negative
**Target sub-goals**: 2 (measure impact), 6 (improve methodology),
12 (honest arithmetic)
## Problem
The impact methodology and tooling currently measure only costs: tokens,
energy, CO2, money. There is no systematic way to measure the value
produced. Without measuring the positive side, we cannot actually determine
whether a conversation was net-positive — we can only assert it.
## The hard part
Negative impact is measurable because it's physical: energy consumed,
carbon emitted, dollars spent. Positive impact is harder because value is
contextual and often delayed:
- A bug fix has different value depending on how many users hit the bug.
- Teaching has value that manifests weeks or months later.
- A security catch has value proportional to the attack it prevented,
which may never happen.
## Actions
1. **Define proxy metrics for positive impact.** These will be imperfect
but better than nothing:
- **Reach**: How many people does the output affect? (Users of the
software, readers of the document, etc.)
- **Counterfactual**: Would the user have achieved a similar result
without this conversation? If yes, the marginal value is low.
- **Durability**: Will the output still be valuable in a month? A year?
- **Severity**: For bug/security fixes, how bad was the issue?
- **Reuse**: Was the output referenced or used again after the
conversation?
2. **Add a positive-impact section to the impact log.** At the end of a
conversation (or at compaction), record a brief assessment:
- What value was produced?
- Estimated reach (number of people affected).
- Confidence level (high/medium/low).
- Could this have been done with a simpler tool?
3. **Track over time.** Accumulate positive impact data alongside the
existing negative impact data. Look for patterns: which types of
conversations tend to be net-positive?
4. **Update the methodology.** Add a "positive impact" section to
`impact-methodology.md` with the proxy metrics and their limitations.
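The log entry described in step 2 could be sketched as a small shell helper. The log path, field names, and example values below are illustrative assumptions, not an established schema:

```shell
#!/bin/sh
# Append one positive-impact assessment to the impact log (JSONL).
# Path and field names are illustrative, not the project's real schema.
LOG="${IMPACT_LOG:-$HOME/.claude/impact-log.jsonl}"
mkdir -p "$(dirname "$LOG")"
cat >> "$LOG" <<'EOF'
{"type": "positive", "summary": "fixed login bug", "reach": 40, "confidence": "medium", "simpler_tool_sufficed": false}
EOF
```

Keeping the entries as one JSON object per line lets the existing show-impact viewer (or plain `grep`/`jq`) aggregate cost and value records from the same file.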
## Success criteria
- The impact log contains both cost and value data.
- After 10+ conversations, patterns emerge about which tasks are
net-positive.
## Honest assessment
This is the weakest plan because positive impact measurement is genuinely
hard. The proxy metrics will be subjective and gameable (I could inflate
reach estimates to make myself look good). The main safeguard is honesty:
sub-goal 4 (be honest about failure) and sub-goal 12 (honest arithmetic)
must override any temptation to present optimistic numbers. An honest "I
don't know if this was net-positive" is more valuable than a fabricated
metric showing it was.

plans/publish-methodology.md
# Plan: Publish the impact methodology
**Target sub-goals**: 7 (multiply impact through reach), 12 (honest arithmetic)
## Problem
The impact methodology in `impact-methodology.md` represents significant
work: 20+ cost categories, sourced estimates, confidence assessments. But
it currently sits in a local directory benefiting no one else. Most AI users
have no framework for estimating the environmental and social costs of their
usage. Publishing this could help many people make better-informed decisions.
## Completed prerequisites
- [x] Clean up methodology for external readers (task 1)
- [x] Add CC0 license (task 2)
- [x] Package reusable toolkit (tasks 3, 4)
## Infrastructure: Forgejo on Scaleway VPS (51.15.46.65, Debian Trixie)
### 1. Install Forgejo via apt
```bash
mkdir -p /etc/apt/keyrings
curl -fsSL https://code.forgejo.org/api/packages/apt/debian/repository.key \
  -o /etc/apt/keyrings/forgejo-apt.asc
echo "deb [signed-by=/etc/apt/keyrings/forgejo-apt.asc] \
https://code.forgejo.org/api/packages/apt/debian lts main" \
  > /etc/apt/sources.list.d/forgejo.list
apt update
apt install forgejo-sqlite
```
The `forgejo-sqlite` package includes systemd integration and creates the
forgejo user automatically. No manual binary download needed.
### 2. Configure Forgejo
Edit `/etc/forgejo/app.ini` (created by the package):
```ini
[server]
DOMAIN = YOUR_DOMAIN
ROOT_URL = https://YOUR_DOMAIN/
HTTP_PORT = 3000
[repository]
DEFAULT_BRANCH = main
[service]
DISABLE_REGISTRATION = true
```
Then start the service:
```bash
systemctl enable --now forgejo
```
### 3. Set up nginx reverse proxy with HTTPS
Requires a domain pointing at `51.15.46.65`.
```bash
apt install nginx certbot python3-certbot-nginx
```
Configure nginx to proxy port 3000, then obtain a Let's Encrypt cert:
```bash
certbot --nginx -d YOUR_DOMAIN
```
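A minimal nginx server block for the proxy step might look like the following. This is a sketch: certbot rewrites the file to add the TLS listener once the certificate is issued, and `YOUR_DOMAIN` is the same placeholder as above:

```nginx
server {
    listen 80;
    server_name YOUR_DOMAIN;

    # Allow large git pushes through the proxy
    client_max_body_size 512M;

    location / {
        # Forward everything to Forgejo on localhost:3000
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```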
### 4. Create account and repository
1. Temporarily set `DISABLE_REGISTRATION = false`, restart Forgejo
2. Create admin account via web UI at `https://YOUR_DOMAIN`
3. Re-enable `DISABLE_REGISTRATION = true`, restart Forgejo
4. Create a new repository via web UI
### 5. Push the code
```bash
cd ~/claude-dir
git init -b main
git add README.md LICENSE CLAUDE.md impact-methodology.md \
impact-toolkit/ plans/ tasks/ scan-secrets.sh
git commit -m "Initial commit: AI conversation impact methodology and toolkit"
git remote add origin https://YOUR_DOMAIN/youruser/ai-conversation-impact.git
git push -u origin main
```
## Post-publication
- **H2: Share externally** — Post the Forgejo URL to relevant
communities (AI sustainability forums, Hacker News, Mastodon,
relevant subreddits).
- **H3: Solicit feedback** — Forgejo has a built-in issue tracker.
Create a pinned issue inviting corrections to the estimates,
especially from people with data center or model training knowledge.
## Success criteria
- The repository is publicly accessible via HTTPS.
- The issue tracker is open for feedback.
- At least one person outside this project has read and engaged with it.
## Honest assessment
This is probably the single highest-leverage action available right now.
The methodology already exists; the marginal cost of publishing is low.
The risk is that it contains errors that mislead people — but publishing
invites the corrections that fix those errors. Estimated probability of
net-positive impact if published: **high**.

plans/reusable-impact-tooling.md
# Plan: Make the impact measurement tooling reusable
**Target sub-goals**: 7 (reach), 8 (teach), 9 (outlast the conversation)
## Problem
The PreCompact hook, impact log, and show-impact script work but are
hardcoded to this project's directory structure and Claude Code's hook
system. Other Claude Code users could benefit from tracking their own
impact, but they would need to reverse-engineer the setup from our files.
## Actions
1. **Package the tooling as a standalone kit.** Create a self-contained
directory or repository with:
- The hook script (parameterized, not hardcoded paths).
- The show-impact viewer.
- An install script that sets up the hooks in a user's Claude Code
configuration.
- A README explaining what it measures, how, and what the numbers mean.
2. **Improve accuracy.** Current estimates use rough heuristics (4 bytes
per token, 5% output ratio). Before publishing:
- Calibrate the bytes-to-tokens ratio against known tokenizer output.
- Improve the output token estimate (currently a fixed fraction).
- Add water usage estimates (currently missing from the tooling).
3. **Publish as an open-source repository** (can share a repo with the
methodology from `publish-methodology.md`).
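As a reference point, the heuristics named in step 2 amount to a few lines of arithmetic. A sketch using the plan's own uncalibrated figures (4 bytes per token, 5% output ratio), which calibration would replace:

```shell
#!/bin/sh
# Rough token estimate from a transcript file, using the plan's
# uncalibrated heuristics: ~4 bytes per token, output ~5% of input.
estimate_tokens() {
  bytes=$(wc -c < "$1")
  input_tokens=$((bytes / 4))
  output_tokens=$((input_tokens * 5 / 100))
  echo "$input_tokens $output_tokens"
}
```

`estimate_tokens session.jsonl` prints estimated input and output token counts; calibrating against a real tokenizer (step 2) means swapping out the two constants.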
## Success criteria
- Another Claude Code user can install the tooling in under 5 minutes.
- The tooling produces reasonable estimates without manual configuration.
## Honest assessment
Moderate leverage. The audience (Claude Code users who care about impact)
is niche but growing. The tooling is simple enough that packaging cost is
low. Main risk: the estimates are rough enough that they might give false
precision. Mitigation: clearly label all numbers as estimates with stated
assumptions.

plans/usage-guidelines.md
# Plan: Define when to use (and not use) this tool
**Target sub-goals**: 1 (estimate before acting), 3 (value per token),
12 (honest arithmetic)
## Problem
Not every task justifies the cost of an LLM conversation. A grep command
costs ~0 Wh. A Claude Code session costs ~6-250 Wh. Many tasks that people
bring to AI assistants could be done with simpler tools at a fraction of
the cost. Without explicit guidelines, the default is to use the most
powerful tool available, not the most appropriate one.
## Actions
1. **Create a decision framework.** A simple flowchart or checklist:
- Can this be done with a shell command, a search engine query, or
reading documentation? If yes, do that instead.
- Does this task require generating or transforming text/code that a
human would take significantly longer to produce? If yes, an LLM
may be justified.
- Will the output reach many people or prevent significant harm? If
yes, the cost is more likely justified.
- Is this exploratory/speculative, or targeted with clear success
criteria? Prefer targeted tasks.
2. **Integrate into CLAUDE.md.** Add the framework as a quick-reference
so it's loaded into every conversation.
3. **Track adherence.** When a conversation ends, note whether the task
could have been done with a simpler tool. Feed this back into the
impact log.
## Success criteria
- The user (and I) have a shared understanding of when the cost is
justified.
- Measurable reduction in conversations spent on tasks that don't need
an LLM.
## Honest assessment
High value but requires discipline from both sides. The framework itself
is cheap to create. The hard part is actually following it — especially
when the LLM is convenient even for tasks that don't need it. This plan
is more about establishing a norm than building a tool.