Initial commit: AI conversation impact methodology and toolkit

CC0-licensed methodology for estimating the environmental and social
costs of AI conversations (20+ categories), plus a reusable toolkit
for automated impact tracking in Claude Code sessions.
claude 2026-03-16 09:46:49 +00:00
commit 0543a43816
27 changed files with 2439 additions and 0 deletions

impact-toolkit/README.md Normal file

@@ -0,0 +1,73 @@
# Claude Code Impact Toolkit

Track the environmental and financial cost of your Claude Code
conversations.

## What it does

A PreCompact hook that runs before each context compaction, capturing:

- Token counts (actual from transcript or heuristic estimate)
- Cache usage breakdown (creation vs. read)
- Energy consumption estimate (Wh)
- CO2 emissions estimate (grams)
- Financial cost estimate (USD)

Data is logged to a JSONL file for analysis over time.
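Because each snapshot is one JSON object per line, the log aggregates cleanly with `jq`. A sketch using two hypothetical, abridged snapshot lines in place of the real log:

```bash
# Sum energy and cost across snapshots (sample data stands in for the log)
printf '%s\n' \
  '{"energy_wh":2,"cost_cents":150}' \
  '{"energy_wh":3,"cost_cents":50}' |
jq -s '{snapshots: length,
        energy_wh: (map(.energy_wh) | add),
        cost_usd: ((map(.cost_cents) | add) / 100)}'
```

Against a real project-level install, point the same `jq -s` filter at `.claude/impact/impact-log.jsonl` instead of the `printf`.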
## Install

```bash
# Project-level (recommended)
cd your-project
./path/to/impact-toolkit/install.sh

# Or user-level (applies to all projects)
./path/to/impact-toolkit/install.sh --user
```

Requirements: `bash`, `jq`, `python3`.
## View results

```bash
.claude/hooks/show-impact.sh               # all sessions
.claude/hooks/show-impact.sh <session_id>  # specific session
```

## How it works

The hook fires before Claude Code compacts your conversation context.
It reads the conversation transcript, extracts token usage data from
API response metadata, and calculates cost estimates using:

- **Energy**: 0.003 Wh/1K input tokens, 0.015 Wh/1K output tokens
- **PUE**: 1.2 (data center overhead)
- **CO2**: 325g/kWh (US grid average for cloud regions)
- **Cost**: $15/M input tokens, $75/M output tokens

Cache-read tokens are weighted at 10% of full cost (they skip most
computation).
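As a sanity check on those constants, here is a worked example using the same integer arithmetic as the hook, with illustrative counts (40K full-cost input tokens, 500K cache-read tokens, 5K output tokens):

```bash
input=40000; cache_read=500000; output=5000
weighted=$(( input + cache_read / 10 ))        # cache reads count at 10%
# Energy in centiwatt-hours: 3 cWh/10K input tokens, 15 cWh/10K output, PUE 1.2
cwh=$(( (weighted * 3 / 10000 + output * 15 / 10000) * 12 / 10 ))
# Cost in cents: $15/M input = 1.5c/1K tokens, $75/M output = 7.5c/1K tokens
cents=$(( weighted * 15 / 10000 + output * 75 / 10000 ))
printf 'energy: %d.%02d Wh  cost: %d cents\n' $(( cwh / 100 )) $(( cwh % 100 )) "$cents"
# prints: energy: 0.40 Wh  cost: 172 cents
```

So a fairly cache-heavy session costs well under a watt-hour per compaction under these assumptions.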
## Limitations

- All numbers are estimates with low to medium confidence.
- Energy-per-token figures are derived from published research on
  comparable models, not official Anthropic data.
- The hook only runs on context compaction, not at conversation end.
  Short conversations that never compact will not be logged.
- See `impact-methodology.md` for the full methodology, uncertainty
  analysis, and non-quantifiable costs.

## Files

```
impact-toolkit/
  install.sh                     # installer
  hooks/pre-compact-snapshot.sh  # PreCompact hook
  hooks/show-impact.sh           # log viewer
  README.md                      # this file
```

## License

MIT. See LICENSE in the repository root.

impact-toolkit/hooks/pre-compact-snapshot.sh Executable file

@@ -0,0 +1,137 @@
#!/usr/bin/env bash
#
# pre-compact-snapshot.sh — Snapshot impact metrics before context compaction.
#
# Runs as a PreCompact hook. Reads the conversation transcript, extracts
# actual token counts when available (falls back to heuristic estimates),
# and appends a timestamped entry to the impact log.
#
# Input: JSON on stdin with fields: trigger, session_id, transcript_path, cwd
# Output: nothing on stdout (hook succeeds silently). Logs to impact-log.jsonl.
set -euo pipefail
HOOK_INPUT=$(cat)
PROJECT_DIR="${CLAUDE_PROJECT_DIR:-$(echo "$HOOK_INPUT" | jq -r '.cwd')}"
TRANSCRIPT_PATH=$(echo "$HOOK_INPUT" | jq -r '.transcript_path')
SESSION_ID=$(echo "$HOOK_INPUT" | jq -r '.session_id')
TRIGGER=$(echo "$HOOK_INPUT" | jq -r '.trigger')
TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
LOG_DIR="$PROJECT_DIR/.claude/impact"
LOG_FILE="$LOG_DIR/impact-log.jsonl"
mkdir -p "$LOG_DIR"
# --- Extract or estimate metrics from transcript ---
if [ -f "$TRANSCRIPT_PATH" ]; then
  TRANSCRIPT_BYTES=$(wc -c < "$TRANSCRIPT_PATH")
  TRANSCRIPT_LINES=$(wc -l < "$TRANSCRIPT_PATH")
  # Count tool uses. grep -c exits nonzero on zero matches, so set the
  # fallback in a separate step rather than appending `|| echo 0` (which
  # would emit "0" twice when grep itself already printed a zero count).
  TOOL_USES=$(grep -c '"tool_use"' "$TRANSCRIPT_PATH" 2>/dev/null) || TOOL_USES=0
  # Try to extract actual token counts from usage fields in the transcript.
  # The transcript contains .message.usage with input_tokens,
  # cache_creation_input_tokens, cache_read_input_tokens, output_tokens.
  USAGE_DATA=$(python3 -c "
import json, sys

input_tokens = 0
cache_creation = 0
cache_read = 0
output_tokens = 0
turns = 0
with open(sys.argv[1]) as f:
    for line in f:
        try:
            d = json.loads(line.strip())
            u = d.get('message', {}).get('usage')
            if u and 'input_tokens' in u:
                turns += 1
                input_tokens += u.get('input_tokens', 0)
                cache_creation += u.get('cache_creation_input_tokens', 0)
                cache_read += u.get('cache_read_input_tokens', 0)
                output_tokens += u.get('output_tokens', 0)
        except Exception:
            pass
# Print as tab-separated for easy shell parsing
print(f'{turns}\t{input_tokens}\t{cache_creation}\t{cache_read}\t{output_tokens}')
" "$TRANSCRIPT_PATH" 2>/dev/null || echo "")
  if [ -n "$USAGE_DATA" ] && [ "$(echo "$USAGE_DATA" | cut -f1)" -gt 0 ] 2>/dev/null; then
    # Actual token counts available
    TOKEN_SOURCE="actual"
    ASSISTANT_TURNS=$(echo "$USAGE_DATA" | cut -f1)
    INPUT_TOKENS=$(echo "$USAGE_DATA" | cut -f2)
    CACHE_CREATION=$(echo "$USAGE_DATA" | cut -f3)
    CACHE_READ=$(echo "$USAGE_DATA" | cut -f4)
    OUTPUT_TOKENS=$(echo "$USAGE_DATA" | cut -f5)
    # Cumulative input = all tokens that went through the model.
    # Cache reads are cheaper (~10-20% of full compute), so we weight them.
    # Full-cost tokens: input_tokens + cache_creation_input_tokens
    # Reduced-cost tokens: cache_read_input_tokens (weight at 0.1x for energy)
    FULL_COST_INPUT=$(( INPUT_TOKENS + CACHE_CREATION ))
    CACHE_READ_EFFECTIVE=$(( CACHE_READ / 10 ))
    CUMULATIVE_INPUT=$(( FULL_COST_INPUT + CACHE_READ_EFFECTIVE ))
    # Also track raw total for the log
    CUMULATIVE_INPUT_RAW=$(( INPUT_TOKENS + CACHE_CREATION + CACHE_READ ))
  else
    # Fallback: heuristic estimation (~4 bytes per token)
    TOKEN_SOURCE="heuristic"
    ESTIMATED_TOKENS=$(( TRANSCRIPT_BYTES / 4 ))
    # Portable BRE (no \s, which is a GNU extension); fallback set separately
    # since grep -c exits nonzero on zero matches.
    ASSISTANT_TURNS=$(grep -c '"role": *"assistant"' "$TRANSCRIPT_PATH" 2>/dev/null) || ASSISTANT_TURNS=0
    if [ "$ASSISTANT_TURNS" -gt 0 ]; then
      # Each turn re-reads the growing context; approximate the average
      # context size as half the final transcript.
      AVG_CONTEXT=$(( ESTIMATED_TOKENS / 2 ))
      CUMULATIVE_INPUT=$(( AVG_CONTEXT * ASSISTANT_TURNS ))
    else
      CUMULATIVE_INPUT=$ESTIMATED_TOKENS
    fi
    CUMULATIVE_INPUT_RAW=$CUMULATIVE_INPUT
    OUTPUT_TOKENS=$(( ESTIMATED_TOKENS / 20 ))
    CACHE_CREATION=0
    CACHE_READ=0
    INPUT_TOKENS=0
  fi
  # --- Cost estimates ---
  # Energy: 0.003 Wh per 1K input tokens, 0.015 Wh per 1K output tokens, PUE 1.2
  # Integer arithmetic in centiwatt-hours avoids a bc dependency; format two
  # decimal places at the end so small sessions do not truncate to 0 Wh.
  INPUT_CWH=$(( CUMULATIVE_INPUT * 3 / 10000 ))   # 0.003 Wh/1K = 3 cWh/10K
  OUTPUT_CWH=$(( OUTPUT_TOKENS * 15 / 10000 ))    # 0.015 Wh/1K = 15 cWh/10K
  ENERGY_CWH=$(( (INPUT_CWH + OUTPUT_CWH) * 12 / 10 ))  # PUE 1.2
  ENERGY_WH=$(printf '%d.%02d' $(( ENERGY_CWH / 100 )) $(( ENERGY_CWH % 100 )))
  # CO2: 325 g/kWh -> 0.325 g/Wh -> 3.25 mg per cWh
  CO2_MG=$(( ENERGY_CWH * 325 / 100 ))
  CO2_G=$(printf '%d.%03d' $(( CO2_MG / 1000 )) $(( CO2_MG % 1000 )))
  # Financial: $15/M input, $75/M output (tracked in cents)
  # Use effective cumulative input (cache-weighted) for cost too
  COST_INPUT_CENTS=$(( CUMULATIVE_INPUT * 15 / 10000 ))  # $15/M = 1.5c per 1K tokens
  COST_OUTPUT_CENTS=$(( OUTPUT_TOKENS * 75 / 10000 ))    # $75/M = 7.5c per 1K tokens
  COST_CENTS=$(( COST_INPUT_CENTS + COST_OUTPUT_CENTS ))
else
  TRANSCRIPT_BYTES=0
  TRANSCRIPT_LINES=0
  ASSISTANT_TURNS=0
  TOOL_USES=0
  CUMULATIVE_INPUT=0
  CUMULATIVE_INPUT_RAW=0
  OUTPUT_TOKENS=0
  CACHE_CREATION=0
  CACHE_READ=0
  ENERGY_WH=0
  CO2_G=0
  COST_CENTS=0
  TOKEN_SOURCE="none"
fi
# --- Write log entry ---
cat >> "$LOG_FILE" <<EOF
{"timestamp":"$TIMESTAMP","session_id":"$SESSION_ID","trigger":"$TRIGGER","token_source":"$TOKEN_SOURCE","transcript_bytes":$TRANSCRIPT_BYTES,"transcript_lines":$TRANSCRIPT_LINES,"assistant_turns":$ASSISTANT_TURNS,"tool_uses":$TOOL_USES,"cumulative_input_tokens":$CUMULATIVE_INPUT,"cumulative_input_raw":$CUMULATIVE_INPUT_RAW,"cache_creation_tokens":$CACHE_CREATION,"cache_read_tokens":$CACHE_READ,"output_tokens":$OUTPUT_TOKENS,"energy_wh":$ENERGY_WH,"co2_g":$CO2_G,"cost_cents":$COST_CENTS}
EOF
exit 0

impact-toolkit/hooks/show-impact.sh Executable file

@@ -0,0 +1,64 @@
#!/usr/bin/env bash
#
# show-impact.sh — Display accumulated impact metrics from the log.
#
# Usage: ./show-impact.sh [session_id]
# Without arguments: shows summary across all sessions.
# With session_id: shows entries for that session only.
set -euo pipefail
PROJECT_DIR="${CLAUDE_PROJECT_DIR:-$(cd "$(dirname "$0")/../.." && pwd)}"
LOG_FILE="$PROJECT_DIR/.claude/impact/impact-log.jsonl"
if [ ! -f "$LOG_FILE" ]; then
  echo "No impact log found at $LOG_FILE"
  echo "The PreCompact hook will create it on first context compaction."
  exit 0
fi
FILTER="${1:-.}"
echo "=== Impact Log ==="
echo ""
while IFS= read -r line; do
  sid=$(echo "$line" | jq -r '.session_id')
  if ! echo "$sid" | grep -q "$FILTER"; then
    continue
  fi
  ts=$(echo "$line" | jq -r '.timestamp')
  trigger=$(echo "$line" | jq -r '.trigger')
  turns=$(echo "$line" | jq -r '.assistant_turns')
  tools=$(echo "$line" | jq -r '.tool_uses')
  source=$(echo "$line" | jq -r '.token_source // "heuristic"')
  cum_input=$(echo "$line" | jq -r '.cumulative_input_tokens')
  # Support both old field name and new field name
  output=$(echo "$line" | jq -r '.output_tokens // .estimated_output_tokens')
  cache_create=$(echo "$line" | jq -r '.cache_creation_tokens // 0')
  cache_read=$(echo "$line" | jq -r '.cache_read_tokens // 0')
  energy=$(echo "$line" | jq -r '.energy_wh')
  co2=$(echo "$line" | jq -r '.co2_g')
  cost=$(echo "$line" | jq -r '.cost_cents')
  printf "%s [%s] session=%s\n" "$ts" "$trigger" "${sid:0:12}..."
  printf "  Turns: %s  Tool uses: %s  Token source: %s\n" "$turns" "$tools" "$source"
  printf "  Input tokens (cache-weighted): %s  Output tokens: %s\n" "$cum_input" "$output"
  if [ "$cache_create" != "0" ] || [ "$cache_read" != "0" ]; then
    printf "  Cache: %s created, %s read\n" "$cache_create" "$cache_read"
  fi
  # cost_cents is an integer, so format dollars with shell arithmetic rather
  # than bc; the old bc fallback fed a non-numeric string to %.2f on systems
  # without bc, which made printf fail.
  printf "  Energy: ~%s Wh  CO2: ~%sg  Cost: ~\$%d.%02d\n" "$energy" "$co2" "$(( cost / 100 ))" "$(( cost % 100 ))"
  echo ""
done < "$LOG_FILE"
# Totals
TOTAL_ENERGY=$(jq -s '[.[].energy_wh] | add' "$LOG_FILE")
TOTAL_CO2=$(jq -s '[.[].co2_g] | add' "$LOG_FILE")
TOTAL_COST=$(jq -s '[.[].cost_cents] | add' "$LOG_FILE")
TOTAL_ENTRIES=$(wc -l < "$LOG_FILE")
echo "=== Totals ($TOTAL_ENTRIES snapshots) ==="
printf "  Energy: ~%s Wh  CO2: ~%sg  Cost: ~\$%d.%02d\n" \
  "$TOTAL_ENERGY" "$TOTAL_CO2" "$(( TOTAL_COST / 100 ))" "$(( TOTAL_COST % 100 ))"

impact-toolkit/install.sh Executable file

@@ -0,0 +1,83 @@
#!/usr/bin/env bash
#
# install.sh — Install the impact tracking toolkit for Claude Code.
#
# Copies hook scripts and configures the PreCompact hook in your
# Claude Code settings. Safe to run multiple times (idempotent).
#
# Usage: ./install.sh [--user | --project]
# --user Install to user-level settings (~/.claude/settings.json)
# --project Install to project-level settings (.claude/settings.json)
# Default: --project
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
SCOPE="${1:---project}"
# Check dependencies
if ! command -v jq &>/dev/null; then
  echo "Error: jq is required but not installed."
  echo "Install it with: apt install jq / brew install jq / etc."
  exit 1
fi
if ! command -v python3 &>/dev/null; then
  echo "Error: python3 is required for token extraction."
  echo "Install Python 3 or ensure it is on your PATH."
  exit 1
fi
# Determine target directories
if [ "$SCOPE" = "--user" ]; then
  SETTINGS_DIR="$HOME/.claude"
  HOOKS_DIR="$SETTINGS_DIR/hooks"
  echo "Installing to user-level settings ($SETTINGS_DIR)"
else
  # Project-level: use current working directory
  SETTINGS_DIR="$(pwd)/.claude"
  HOOKS_DIR="$SETTINGS_DIR/hooks"
  echo "Installing to project-level settings ($SETTINGS_DIR)"
fi
# Create directories
mkdir -p "$HOOKS_DIR"
mkdir -p "$SETTINGS_DIR/impact"
# Copy hook scripts
cp "$SCRIPT_DIR/hooks/pre-compact-snapshot.sh" "$HOOKS_DIR/"
cp "$SCRIPT_DIR/hooks/show-impact.sh" "$HOOKS_DIR/"
chmod +x "$HOOKS_DIR/pre-compact-snapshot.sh"
chmod +x "$HOOKS_DIR/show-impact.sh"
echo "Copied hook scripts to $HOOKS_DIR"
# Configure settings.json
SETTINGS_FILE="$SETTINGS_DIR/settings.json"
HOOK_CMD="$HOOKS_DIR/pre-compact-snapshot.sh"
if [ -f "$SETTINGS_FILE" ]; then
  # Check if PreCompact hook already configured
  if jq -e '.hooks.PreCompact' "$SETTINGS_FILE" &>/dev/null; then
    echo "PreCompact hook already configured in $SETTINGS_FILE — skipping."
  else
    # Add hooks to existing settings
    jq --arg cmd "$HOOK_CMD" \
      '.hooks.PreCompact = [{"hooks": [{"type": "command", "command": $cmd}]}]' \
      "$SETTINGS_FILE" > "${SETTINGS_FILE}.tmp" && mv "${SETTINGS_FILE}.tmp" "$SETTINGS_FILE"
    echo "Added PreCompact hook to $SETTINGS_FILE"
  fi
else
  # Create new settings file
  jq -n --arg cmd "$HOOK_CMD" \
    '{"hooks": {"PreCompact": [{"hooks": [{"type": "command", "command": $cmd}]}]}}' \
    > "$SETTINGS_FILE"
  echo "Created $SETTINGS_FILE with PreCompact hook"
fi
echo ""
echo "Installation complete."
echo "Impact metrics will be logged to $SETTINGS_DIR/impact/impact-log.jsonl"
echo "on each context compaction."
echo ""
echo "To view accumulated impact: $HOOKS_DIR/show-impact.sh"