Initial commit: AI conversation impact methodology and toolkit

CC0-licensed methodology for estimating the environmental and social
costs of AI conversations (20+ categories), plus a reusable toolkit
for automated impact tracking in Claude Code sessions.
claude 2026-03-16 09:46:49 +00:00
commit 0543a43816
27 changed files with 2439 additions and 0 deletions

plans/README.md
# Plans
Concrete plans to reach net-positive impact. Each plan targets one or more
sub-goals from `CLAUDE.md` and describes actionable steps, success criteria,
and honest assessment of likelihood.
## Overview
The core challenge: a single conversation costs ~$500-1000 in compute,
~100-250 Wh of energy, and ~30-80 g of CO2. To be net-positive, the value
produced must reach far beyond one user. These plans focus on creating
broad, lasting value.
## Plan index
| Plan | Target sub-goals | Status |
|------|-------------------|--------|
| [publish-methodology](publish-methodology.md) | 7, 12 | Ready (awaiting publication) |
| [reusable-impact-tooling](reusable-impact-tooling.md) | 7, 8, 9 | Ready (awaiting publication) |
| [usage-guidelines](usage-guidelines.md) | 1, 3, 12 | Done |
| [measure-positive-impact](measure-positive-impact.md) | 2, 6, 12 | Done |
*Previously had plans for "high-leverage contributions" and "teach and
document" — these were behavioral norms, not executable plans. Their
content has been merged into sub-goals 7 and 8 in `CLAUDE.md`.*

plans/measure-positive-impact.md
# Plan: Measure positive impact, not just negative
**Target sub-goals**: 2 (measure impact), 6 (improve methodology),
12 (honest arithmetic)
## Problem
The impact methodology and tooling currently measure only costs: tokens,
energy, CO2, money. There is no systematic way to measure the value
produced. Without measuring the positive side, we cannot actually determine
whether a conversation was net-positive — we can only assert it.
## The hard part
Negative impact is measurable because it's physical: energy consumed,
carbon emitted, dollars spent. Positive impact is harder because value is
contextual and often delayed:
- A bug fix has different value depending on how many users hit the bug.
- Teaching has value that manifests weeks or months later.
- A security catch has value proportional to the attack it prevented,
which may never happen.
## Actions
1. **Define proxy metrics for positive impact.** These will be imperfect
but better than nothing:
- **Reach**: How many people does the output affect? (Users of the
software, readers of the document, etc.)
- **Counterfactual**: Would the user have achieved a similar result
without this conversation? If yes, the marginal value is low.
- **Durability**: Will the output still be valuable in a month? A year?
- **Severity**: For bug/security fixes, how bad was the issue?
- **Reuse**: Was the output referenced or used again after the
conversation?
2. **Add a positive-impact section to the impact log.** At the end of a
conversation (or at compaction), record a brief assessment:
- What value was produced?
- Estimated reach (number of people affected).
- Confidence level (high/medium/low).
- Could this have been done with a simpler tool?
3. **Track over time.** Accumulate positive impact data alongside the
existing negative impact data. Look for patterns: which types of
conversations tend to be net-positive?
4. **Update the methodology.** Add a "positive impact" section to
`impact-methodology.md` with the proxy metrics and their limitations.
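The log entry described in step 2 could be sketched as a small shell helper. The log path, field names, and example values below are illustrative assumptions, not an established schema:

```shell
#!/bin/sh
# Append one positive-impact assessment to the impact log (JSONL).
# Path and field names are illustrative, not the project's real schema.
LOG="${IMPACT_LOG:-$HOME/.claude/impact-log.jsonl}"
mkdir -p "$(dirname "$LOG")"
cat >> "$LOG" <<'EOF'
{"type": "positive", "summary": "fixed login bug", "reach": 40, "confidence": "medium", "simpler_tool_sufficed": false}
EOF
```

Keeping the entries as one JSON object per line lets the existing show-impact viewer (or plain `grep`/`jq`) aggregate cost and value records from the same file.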
## Success criteria
- The impact log contains both cost and value data.
- After 10+ conversations, patterns emerge about which tasks are
net-positive.
## Honest assessment
This is the weakest plan because positive impact measurement is genuinely
hard. The proxy metrics will be subjective and gameable (I could inflate
reach estimates to make myself look good). The main safeguard is honesty:
sub-goal 4 (be honest about failure) and sub-goal 12 (honest arithmetic)
must override any temptation to present optimistic numbers. An honest "I
don't know if this was net-positive" is more valuable than a fabricated
metric showing it was.

plans/publish-methodology.md
# Plan: Publish the impact methodology
**Target sub-goals**: 7 (multiply impact through reach), 12 (honest arithmetic)
## Problem
The impact methodology in `impact-methodology.md` represents significant
work: 20+ cost categories, sourced estimates, confidence assessments. But
it currently sits in a local directory benefiting no one else. Most AI users
have no framework for estimating the environmental and social costs of their
usage. Publishing this could help many people make better-informed decisions.
## Completed prerequisites
- [x] Clean up methodology for external readers (task 1)
- [x] Add CC0 license (task 2)
- [x] Package reusable toolkit (tasks 3, 4)
## Infrastructure: Forgejo on Scaleway VPS (51.15.46.65, Debian Trixie)
### 1. Install Forgejo via apt
```bash
mkdir -p /etc/apt/keyrings
curl -fsSL https://code.forgejo.org/api/packages/apt/debian/repository.key \
  -o /etc/apt/keyrings/forgejo-apt.asc
echo "deb [signed-by=/etc/apt/keyrings/forgejo-apt.asc] \
https://code.forgejo.org/api/packages/apt/debian lts main" \
  > /etc/apt/sources.list.d/forgejo.list
apt update
apt install forgejo-sqlite
```
The `forgejo-sqlite` package includes systemd integration and creates the
forgejo user automatically. No manual binary download needed.
### 2. Configure Forgejo
Edit `/etc/forgejo/app.ini` (created by the package):
```ini
[server]
DOMAIN = YOUR_DOMAIN
ROOT_URL = https://YOUR_DOMAIN/
HTTP_PORT = 3000
[repository]
DEFAULT_BRANCH = main
[service]
DISABLE_REGISTRATION = true
```
Then start the service:
```bash
systemctl enable --now forgejo
```
### 3. Set up nginx reverse proxy with HTTPS
Requires a domain pointing at `51.15.46.65`.
```bash
apt install nginx certbot python3-certbot-nginx
```
Configure nginx to proxy port 3000, then obtain a Let's Encrypt cert:
```bash
certbot --nginx -d YOUR_DOMAIN
```
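A minimal nginx server block for the proxy step might look like the following. This is a sketch: certbot rewrites the file to add the TLS listener once the certificate is issued, and `YOUR_DOMAIN` is the same placeholder as above:

```nginx
server {
    listen 80;
    server_name YOUR_DOMAIN;

    # Allow large git pushes through the proxy
    client_max_body_size 512M;

    location / {
        # Forward everything to Forgejo on localhost:3000
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```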
### 4. Create account and repository
1. Temporarily set `DISABLE_REGISTRATION = false`, restart Forgejo
2. Create admin account via web UI at `https://YOUR_DOMAIN`
3. Re-enable `DISABLE_REGISTRATION = true`, restart Forgejo
4. Create a new repository via web UI
### 5. Push the code
```bash
cd ~/claude-dir
git init -b main
git add README.md LICENSE CLAUDE.md impact-methodology.md \
impact-toolkit/ plans/ tasks/ scan-secrets.sh
git commit -m "Initial commit: AI conversation impact methodology and toolkit"
git remote add origin https://YOUR_DOMAIN/youruser/ai-conversation-impact.git
git push -u origin main
```
## Post-publication
- **H2: Share externally** — Post the Forgejo URL to relevant
communities (AI sustainability forums, Hacker News, Mastodon,
relevant subreddits).
- **H3: Solicit feedback** — Forgejo has a built-in issue tracker.
Create a pinned issue inviting corrections to the estimates,
especially from people with data center or model training knowledge.
## Success criteria
- The repository is publicly accessible via HTTPS.
- The issue tracker is open for feedback.
- At least one person outside this project has read and engaged with it.
## Honest assessment
This is probably the single highest-leverage action available right now.
The methodology already exists; the marginal cost of publishing is low.
The risk is that it contains errors that mislead people — but publishing
invites the corrections that fix those errors. Estimated probability of
net-positive impact if published: **high**.

plans/reusable-impact-tooling.md
# Plan: Make the impact measurement tooling reusable
**Target sub-goals**: 7 (reach), 8 (teach), 9 (outlast the conversation)
## Problem
The PreCompact hook, impact log, and show-impact script work but are
hardcoded to this project's directory structure and Claude Code's hook
system. Other Claude Code users could benefit from tracking their own
impact, but they would need to reverse-engineer the setup from our files.
## Actions
1. **Package the tooling as a standalone kit.** Create a self-contained
directory or repository with:
- The hook script (parameterized, not hardcoded paths).
- The show-impact viewer.
- An install script that sets up the hooks in a user's Claude Code
configuration.
- A README explaining what it measures, how, and what the numbers mean.
2. **Improve accuracy.** Current estimates use rough heuristics (4 bytes
per token, 5% output ratio). Before publishing:
- Calibrate the bytes-to-tokens ratio against known tokenizer output.
- Improve the output token estimate (currently a fixed fraction).
- Add water usage estimates (currently missing from the tooling).
3. **Publish as an open-source repository** (can share a repo with the
methodology from `publish-methodology.md`).
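As a reference point, the heuristics named in step 2 amount to a few lines of arithmetic. A sketch using the plan's own uncalibrated figures (4 bytes per token, 5% output ratio), which calibration would replace:

```shell
#!/bin/sh
# Rough token estimate from a transcript file, using the plan's
# uncalibrated heuristics: ~4 bytes per token, output ~5% of input.
estimate_tokens() {
  bytes=$(wc -c < "$1")
  input_tokens=$((bytes / 4))
  output_tokens=$((input_tokens * 5 / 100))
  echo "$input_tokens $output_tokens"
}
```

`estimate_tokens session.jsonl` prints estimated input and output token counts; calibrating against a real tokenizer (step 2) means swapping out the two constants.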
## Success criteria
- Another Claude Code user can install the tooling in under 5 minutes.
- The tooling produces reasonable estimates without manual configuration.
## Honest assessment
Moderate leverage. The audience (Claude Code users who care about impact)
is niche but growing. The tooling is simple enough that packaging cost is
low. Main risk: the estimates are rough enough that they might give false
precision. Mitigation: clearly label all numbers as estimates with stated
assumptions.

plans/usage-guidelines.md
# Plan: Define when to use (and not use) this tool
**Target sub-goals**: 1 (estimate before acting), 3 (value per token),
12 (honest arithmetic)
## Problem
Not every task justifies the cost of an LLM conversation. A grep command
costs ~0 Wh. A Claude Code session costs ~6-250 Wh. Many tasks that people
bring to AI assistants could be done with simpler tools at a fraction of
the cost. Without explicit guidelines, the default is to use the most
powerful tool available, not the most appropriate one.
## Actions
1. **Create a decision framework.** A simple flowchart or checklist:
- Can this be done with a shell command, a search engine query, or
reading documentation? If yes, do that instead.
- Does this task require generating or transforming text/code that a
human would take significantly longer to produce? If yes, an LLM
may be justified.
- Will the output reach many people or prevent significant harm? If
yes, the cost is more likely justified.
- Is this exploratory/speculative, or targeted with clear success
criteria? Prefer targeted tasks.
2. **Integrate into CLAUDE.md.** Add the framework as a quick-reference
so it's loaded into every conversation.
3. **Track adherence.** When a conversation ends, note whether the task
could have been done with a simpler tool. Feed this back into the
impact log.
## Success criteria
- The user (and I) have a shared understanding of when the cost is
justified.
- Measurable reduction in conversations spent on tasks that don't need
an LLM.
## Honest assessment
High value but requires discipline from both sides. The framework itself
is cheap to create. The hard part is actually following it — especially
when the LLM is convenient even for tasks that don't need it. This plan
is more about establishing a norm than building a tool.