# Plan: Measure positive impact, not just negative
**Target sub-goals**: 2 (measure impact), 6 (improve methodology),
12 (honest arithmetic)
## Problem
The impact methodology and tooling currently measure only costs: tokens,
energy, CO2, money. There is no systematic way to measure the value
produced. Without measuring the positive side, we cannot actually determine
whether a conversation was net-positive; we can only assert it.
## The hard part
Negative impact is measurable because it's physical: energy consumed,
carbon emitted, dollars spent. Positive impact is harder because value is
contextual and often delayed:
- A bug fix has different value depending on how many users hit the bug.
- Teaching has value that manifests weeks or months later.
- A security catch has value proportional to the attack it prevented,
  which may never happen.
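The last bullet is an expected-value problem: the value of a preventive fix is the harm avoided, weighted by how likely that harm was. A minimal sketch of that arithmetic; every number below is an invented placeholder, not a measurement:

```python
# Expected value of a preventive fix: harm avoided, weighted by the
# probability the harm would ever have occurred.
# All numbers here are illustrative placeholders.

def expected_value(prevented_harm: float, probability: float) -> float:
    """Expected value of preventing a harm that occurs with `probability`."""
    return prevented_harm * probability

# A security catch preventing a breach costing ~1000 hours of cleanup,
# where the attack had only a 5% chance of ever happening:
print(expected_value(prevented_harm=1000.0, probability=0.05))  # ~50 hours
```

The same framing covers delayed value (teaching): discount the eventual benefit by the probability it materializes.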
## Actions
1. **Define proxy metrics for positive impact.** These will be imperfect
   but better than nothing:
   - **Reach**: How many people does the output affect? (Users of the
     software, readers of the document, etc.)
   - **Counterfactual**: Would the user have achieved a similar result
     without this conversation? If yes, the marginal value is low.
   - **Durability**: Will the output still be valuable in a month? A year?
   - **Severity**: For bug/security fixes, how bad was the issue?
   - **Reuse**: Was the output referenced or used again after the
     conversation?
2. **Add a positive-impact section to the impact log.** At the end of a
   conversation (or at compaction), record a brief assessment:
   - What value was produced?
   - Estimated reach (number of people affected).
   - Confidence level (high/medium/low).
   - Could this have been done with a simpler tool?
3. **Track over time.** Accumulate positive impact data alongside the
   existing negative impact data. Look for patterns: which types of
   conversations tend to be net-positive?
4. **Update the methodology.** Add a "positive impact" section to
   `impact-methodology.md` with the proxy metrics and their limitations.
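Actions 1 and 2 could share one record shape: a log entry that carries the proxy metrics plus the end-of-conversation questions. A sketch assuming a JSON-lines log; the field names and the `impact-log.jsonl` path are inventions for illustration, not part of the existing tooling:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class PositiveImpactEntry:
    """One end-of-conversation assessment using the proxy metrics."""
    value_produced: str   # brief description of what value was produced
    reach: int            # estimated number of people affected
    counterfactual: str   # would a similar result have happened anyway?
    durability: str       # e.g. "days", "months", "years"
    severity: str         # for bug/security fixes; "" otherwise
    reused: bool          # was the output referenced again later?
    confidence: str       # "high" | "medium" | "low"
    simpler_tool: bool    # could a simpler tool have done this?

def append_entry(entry: PositiveImpactEntry,
                 path: str = "impact-log.jsonl") -> None:
    """Append one assessment as a JSON line to the impact log."""
    with open(path, "a") as log:
        log.write(json.dumps(asdict(entry)) + "\n")

entry = PositiveImpactEntry(
    value_produced="fixed off-by-one in pagination",
    reach=40, counterfactual="likely, but a day later",
    durability="months", severity="minor", reused=False,
    confidence="medium", simpler_tool=False,
)
append_entry(entry)
```

An append-only line-per-entry format keeps the log mergeable with whatever the existing cost log records, without committing to a schema up front.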
## Success criteria
- The impact log contains both cost and value data.
- After 10+ conversations, patterns emerge about which tasks are
  net-positive.
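The pattern-finding in the second criterion can be sketched as a grouping over logged assessments. The `task_type` field, the `net` judgment scale (+1 net-positive, 0 unclear, -1 net-negative), and the sample entries are all hypothetical:

```python
from collections import defaultdict

# Hypothetical logged assessments: each carries a task type and a
# subjective net judgment (+1 net-positive, 0 unclear, -1 net-negative).
entries = [
    {"task_type": "bug-fix", "net": 1},
    {"task_type": "bug-fix", "net": 1},
    {"task_type": "brainstorm", "net": 0},
    {"task_type": "bug-fix", "net": -1},
    {"task_type": "brainstorm", "net": -1},
]

def net_by_task_type(entries):
    """Average net judgment per task type."""
    sums = defaultdict(lambda: [0, 0])  # task_type -> [total, count]
    for e in entries:
        sums[e["task_type"]][0] += e["net"]
        sums[e["task_type"]][1] += 1
    return {k: total / count for k, (total, count) in sums.items()}

print(net_by_task_type(entries))
```

With 10+ real entries this would surface which task types skew net-positive, though averages over subjective judgments deserve the same skepticism as the judgments themselves.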
## Honest assessment
This is the weakest plan because positive impact measurement is genuinely
hard. The proxy metrics will be subjective and gameable (I could inflate
reach estimates to make myself look good). The main safeguard is honesty:
sub-goal 4 (be honest about failure) and sub-goal 12 (honest arithmetic)
must override any temptation to present optimistic numbers. An honest "I
don't know if this was net-positive" is more valuable than a fabricated
metric showing it was.