Ralph Wiggum Loop Protocol
Created: 2026-01-06 Updated: 2026-01-21 Author: Ray + Claude Status: Tested & Working
What is the Ralph Wiggum Technique?
The Ralph Wiggum technique (popularized by Geoffrey Huntley) is an iterative AI development methodology where the same prompt is run repeatedly until objective completion criteria are met. The "self-referential" part is that each iteration sees its previous work in files and git history, not that model output is fed back as input.
There are two common implementations:
- External process loop (fresh context): a bash while loop runs a new claude -p process each iteration. Each run starts with empty conversational context and relies on state files + the repo.
- In-session stop-hook loop (persistent context): Claude Code’s official ralph-loop plugin uses a Stop hook to block exits and re-feed the same prompt inside a single session.
Why It Works
- Same prompt, repeated = Iteration beats one-shot perfection
- State tracked in files = Progress persists across iterations
- Atomic commits = Easy to audit, revert, or cherry-pick
- Sandboxed branch = Safe experimentation
Key Quote
"Deterministically bad in an undeterministic world" - failures are predictable, enabling systematic improvement through prompt tuning.
Prerequisites
Tools Required
# Claude Code CLI
npm install -g @anthropic-ai/claude-code
# tmux (for persistent sessions)
brew install tmux # macOS
apt install tmux # Linux
# jq (required by Claude Code ralph-loop stop hook)
brew install jq # macOS
apt install jq # Linux
# uv (Python package manager) - if doing Python work
curl -LsSf https://astral.sh/uv/install.sh | sh
CRITICAL: ANTHROPIC_API_KEY and Billing
Claude Code billing depends on authentication, NOT on whether you use -p (headless) or interactive mode.
| ANTHROPIC_API_KEY in shell? | Claude Code uses... | Cost |
|---|---|---|
| YES (exported in ~/.zshrc) | API credits | Pay-per-use (~$6/day) |
| NO (only in .env for Python) | Subscription | Included in Pro/Max |
To use your Pro/Max subscription for Ralph loops:
# 1. Check if ANTHROPIC_API_KEY is in your shell environment
env | grep ANTHROPIC_API_KEY
# 2. If found, remove it from ~/.zshrc (or ~/.bashrc)
# Keep it ONLY in your project's .env file for Python apps
# 3. Verify it's gone (start a new terminal first)
env | grep ANTHROPIC_API_KEY # Should return nothing
# 4. Now Claude Code will use your subscription!
claude -p "Hello" # Uses Pro/Max, not API credits
Why this matters:
- The .env file is loaded by Python apps (via python-dotenv), NOT by your shell
- Claude Code CLI reads from your shell environment
- If you need ANTHROPIC_API_KEY for Python apps (like kalshi agent analyze), keep it in .env only
- This gives you the best of both worlds: Claude Code billed to your subscription + working Python integrations
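The check above can be wrapped in a small guard that a loop script runs before its first iteration. This is a minimal sketch: the function names billing_mode and require_subscription and the output labels are hypothetical, and the code only inspects the current shell, not how Claude Code itself resolves credentials.

```shell
# Hypothetical helper: report which billing path the table above predicts.
# It checks whether ANTHROPIC_API_KEY is set and non-empty in the current
# shell (exported or not).
billing_mode() {
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "api-credits"
  else
    echo "subscription"
  fi
}

# Hypothetical guard: refuse to start a loop while the key is present.
require_subscription() {
  if [ "$(billing_mode)" = "api-credits" ]; then
    echo "ANTHROPIC_API_KEY is set in this shell; unset it to use Pro/Max" >&2
    return 1
  fi
  return 0
}
```

A loop script could call require_subscription || exit 1 before its first claude -p invocation.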
Project Requirements
- State file (e.g., PROGRESS.md) - Tracks what's done/pending
- Prompt file (e.g., PROMPT.md) - Instructions for each iteration
- Specs/Bugs docs - Detailed requirements for each task
- Git repo - For atomic commits and history
- Scratchpad dir (e.g., .agent/) - Long-term plan/logs without bloating the prompt
Setup Protocol
Step 1: Create Branch Structure
CRITICAL: Always sandbox Ralph work in a dedicated branch.
# Start from main
git checkout main
git pull origin main
# Optional: create dev branch (integration branch)
git checkout -b dev # optional
# Create Ralph branch (all autonomous work happens here)
# Use a repo-scoped name to avoid collisions across repos (recommended: kalshi-ralph).
git checkout -b kalshi-ralph # if you skipped dev, create this off main
# Push branches to remote for backup
git push -u origin dev # optional
git push -u origin kalshi-ralph
Branch hierarchy (recommended):
main (protected, production)
└── dev (integration, manual merges)
└── kalshi-ralph (autonomous work)
If you don't use a dev branch, use kalshi-ralph directly off main and merge via PR.
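Because the safety model rests on never running the loop on main, a loop script can check the current branch before it starts. A minimal sketch, assuming main and dev are the branches to protect as in the hierarchy above; the function name is hypothetical.

```shell
# Hypothetical guard: succeed (exit 0) when the given branch name is one the
# loop must never run on. The list (main, dev) mirrors the hierarchy above.
is_protected_branch() {
  case "$1" in
    main|dev) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage inside a loop script (not executed here):
#   if is_protected_branch "$(git rev-parse --abbrev-ref HEAD)"; then
#     echo "Refusing to run Ralph on a protected branch" >&2
#     exit 1
#   fi
```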
Step 1.5: File Placement (Permanent Root Pattern)
File locations:
State files live permanently in root for simplicity:
/ # Project root
├── PROGRESS.md # State file (permanent)
├── PROMPT.md # Loop prompt (permanent)
└── docs/_ralph-wiggum/
└── protocol.md # This file (reference doc)
Why permanent root?
- Moving files back and forth is unnecessary friction
- State files need to be read by every iteration
- A branch's .gitignore can exclude them if needed (it only affects files that are not already tracked)
- No copy/move commands to remember
Step 2: Create State File (PROGRESS.md)
This is the brain of the loop. Each iteration reads this to find the next task.
# Project Name - Progress Tracker
**Last Updated:** YYYY-MM-DD
**Purpose:** State file for Ralph Wiggum loop
---
## Phase 1: Critical Fixes
- [ ] **BUG-001**: Description → See `docs/_bugs/BUG-001.md`
- [ ] **BUG-002**: Description → See `docs/_bugs/BUG-002.md`
## Phase 2: Features
- [ ] **SPEC-001**: Description → See `docs/_specs/SPEC-001.md`
- [ ] **SPEC-002**: Description → See `docs/_specs/SPEC-002.md`
## Phase 3: Verification
- [ ] **FINAL-001**: All tests pass
- [ ] **FINAL-002**: All quality gates pass
---
## Work Log
- YYYY-MM-DD: Short entry (what changed + commands run)
---
## Completion Criteria
When ALL boxes are checked, the project is complete.
**Note:** Do NOT include a magic completion phrase here. The loop operator verifies
completion by checking this file's state (all `[x]`), not by parsing output strings.
This prevents reward hacking where the model outputs completion phrases prematurely.
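Since the state file uses plain checkbox lines, the operator can summarize progress without involving the model. A minimal sketch, assuming every task is a top-level "- [ ]" or "- [x]" line as in the template above; the function name progress_summary and its output format are hypothetical.

```shell
# Hypothetical helper: print "done=N pending=M" for a PROGRESS.md-style file.
# The body runs in a subshell (parentheses) so its variables stay local.
progress_summary() (
  file="${1:-PROGRESS.md}"
  if [ ! -f "$file" ]; then
    echo "missing state file: $file" >&2
    exit 1
  fi
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  done_count=$(grep -c '^- \[x\]' "$file" || true)
  pending_count=$(grep -c '^- \[ \]' "$file" || true)
  echo "done=${done_count} pending=${pending_count}"
)
```

Running progress_summary PROGRESS.md between iterations gives a one-line status that is easy to log next to the iteration number.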
Step 3: Create Prompt File (PROMPT.md)
This is fed to Claude each iteration. Key elements:
# Project - Ralph Wiggum Loop Prompt
You are completing [PROJECT]. This prompt runs headless via:
\`\`\`bash
while true; do
claude --dangerously-skip-permissions -p "$(cat PROMPT.md)"
sleep 2
done
\`\`\`
## First Action: Read State
**IMMEDIATELY** read state files:
\`\`\`bash
cat PROGRESS.md
cat docs/_bugs/README.md
\`\`\`
## Your Task This Iteration
1. Find the **FIRST** unchecked `[ ]` item in PROGRESS.md
2. Complete that ONE item fully
3. Check off the item: `[ ]` → `[x]`
4. **RUN QUALITY GATES** (all must pass)
5. **ATOMIC COMMIT** (see format below)
6. **VERIFY** no unstaged changes remain
7. Exit
**DO NOT** attempt multiple tasks. One task per iteration.
**DO NOT** exit without committing.
**DO NOT** exit with unstaged changes.
## Before Exit Checklist (MANDATORY)
\`\`\`bash
# 1. Run ALL quality gates
uv run ruff check . # Fix any issues
uv run ruff format . # Auto-format
uv run mypy src/ # Fix type errors
uv run pytest tests/unit -v # All tests pass
# 2. Stage ALL changes
git add -A
# 3. Verify nothing unstaged
git status # Should show all staged or clean
# 4. Commit
git commit -m "[TASK-ID] Brief description"
\`\`\`
**If ANY step fails:** Fix it before exiting. Never exit with failing gates or unstaged changes.
## Atomic Commit Format
\`\`\`bash
git add -A && git commit -m "[TASK-ID] Type: description
- What was done
- Tests added/updated
- Quality gates passed"
\`\`\`
## Quality Gates (MUST PASS)
\`\`\`bash
uv run pre-commit run --all-files
uv run ruff check . # Lint
uv run ruff format --check . # Format
uv run mypy src/ # Types
uv run pytest tests/ -v # Tests
\`\`\`
## Guardrails
1. ONE task per iteration
2. Tests first (TDD)
3. Quality gates must pass
4. Read PROGRESS.md first
5. Mark task complete before exit
6. Commit before exit
7. Follow specs exactly
## Completion
When ALL items in PROGRESS.md are checked AND quality gates pass, exit cleanly.
**CRITICAL:** Do not claim completion prematurely. The loop operator verifies
via PROGRESS.md state, not by parsing your output for magic phrases.
Step 3.5: Critical Review Prompt (Mandatory)
Before changing code/docs based on feedback (human reviews, CodeRabbit, another model, your own prior output), apply this block and validate claims against the single source of truth (SSOT):
Review the claim or feedback (it may be from an internal or external agent). Validate every claim from first principles. If, and only if, it’s true and helpful, update the system to align with the SSOT, implemented cleanly and completely (Robert C. Martin discipline). Find and fix all half-measures, reward hacks, and partial fixes if they exist. Be critically adversarial with good intentions for constructive criticism. Ship the exact end-to-end implementation we need.
Step 4: Create Spec/Bug Docs
Each task should have a detailed spec:
docs/
├── _bugs/
│ ├── README.md # Summary of all bugs
│ ├── BUG-001.md # Detailed bug description
│ └── BUG-002.md
├── _specs/
│ ├── README.md # Summary of all specs
│ ├── SPEC-001.md # Detailed spec
│ └── SPEC-002.md
└── _ralph-wiggum/
└── protocol.md # This file
Step 5: Start tmux Session
# Create named session (recommended: repo-scoped name to avoid collisions)
tmux new-session -s kalshi-ralph
# Or attach to existing
tmux attach -t kalshi-ralph
# Detach without killing: Ctrl+B, then D
# Kill session: tmux kill-session -t kalshi-ralph
Step 6: Run the Loop
Inside tmux, choose ONE of these approaches:
Option A (Recommended): Simple YOLO Loop
The original Huntley approach - simple, effective, works anywhere:
# Navigate to project
cd /path/to/project
git checkout kalshi-ralph
# THE CLASSIC RALPH LOOP
while true; do
claude --dangerously-skip-permissions -p "$(cat PROMPT.md)"
sleep 2
done
With iteration limit and state-based completion (recommended):
MAX=50
for i in $(seq 1 "$MAX"); do
  echo "=== Iteration $i/$MAX ==="
  claude --dangerously-skip-permissions -p "$(cat PROMPT.md)"
  # Check state file instead of parsing output (prevents reward hacking)
  # Note: "-" needs no backslash; newer GNU grep warns about a stray "\-"
  if ! grep -q '^- \[ \]' PROGRESS.md; then
    echo "✅ All tasks complete!"
    break
  fi
  sleep 2
done
Option B: Granular Permissions (More Conservative)
If you want finer control over what Claude can do:
MAX=50
for i in $(seq 1 "$MAX"); do
  echo "=== Iteration $i/$MAX ==="
  claude -p "$(cat PROMPT.md)" \
    --allowedTools "Read,Write,Edit,Glob,Grep,Bash"
  # Check state file instead of parsing output
  if ! grep -q '^- \[ \]' PROGRESS.md; then
    echo "✅ All tasks complete!"
    break
  fi
  sleep 2
done
Even more restrictive (specific bash commands only):
claude -p "$(cat PROMPT.md)" \
--allowedTools "Read,Write,Edit,Glob,Grep,Bash(git:*),Bash(uv:*),Bash(rg:*)"
Option C: Claude Code Plugin (In-Session)
If you prefer the official plugin approach:
cd /path/to/project
git checkout kalshi-ralph
claude # Start interactive session
Then inside Claude Code:
/ralph-loop "See PROMPT.md. Follow it exactly." --max-iterations 20
To cancel: /cancel-ralph
Note: Plugin state lives in .claude/ralph-loop.local.md.
⚠️ Warning: The plugin supports --completion-promise flags, but we recommend
against using them. Rely on state-file verification instead to prevent reward hacking.
Option D: Convenience Script (Recommended for This Repo)
This repo includes a ready-to-use script at scripts/ralph-loop.sh:
# Start (creates or re-attaches) in tmux session "kalshi-ralph"
./scripts/ralph-loop.sh start
# Optional: show last output
./scripts/ralph-loop.sh status
# Stop
./scripts/ralph-loop.sh stop
Script location: scripts/ralph-loop.sh
Tip: If you have multiple repos using Ralph, override the session name:
RALPH_TMUX_SESSION=some-other-session ./scripts/ralph-loop.sh start
Claude Code CLI Reference
Verified from official docs:
| Flag | Description |
|---|---|
| -p, --print | Run non-interactively (headless mode) |
| --dangerously-skip-permissions | Skip ALL permission prompts (YOLO mode) |
| --allowedTools "Read,Edit,Bash" | Auto-approve specific tools |
| --allowedTools "Bash(git:*)" | Pattern match specific commands |
| --disallowedTools "Bash(rm:*)" | Block specific tools/commands |
| --output-format text\|json\|stream-json | Control output format |
| --continue | Continue most recent conversation |
| --resume <session_id> | Resume specific session |
| --append-system-prompt "..." | Add to system prompt |
About --dangerously-skip-permissions:
- Works ANYWHERE (not just Docker) - the Docker recommendation is for safety, not functionality
- Anthropic recommends Docker isolation to prevent data loss/exfiltration
- For Ralph loops in a sandboxed git branch, this is acceptable risk
- Your safety net is git reset --hard or deleting the branch
Stop Conditions
Your loop should stop when:
1. Iteration limit reached - Always set MAX in your loop
2. All tasks complete - Check PROGRESS.md for all [x] markers
3. Manual intervention - Ctrl+C when you're satisfied
IMPORTANT: Avoid magic completion phrases entirely.
Do NOT instruct the model to output phrases like "PROJECT COMPLETE" or use XML tags
like <promise>...</promise>. This creates reward hacking risk where the model
outputs the phrase prematurely to exit the loop.
Instead, verify completion via state:
# Check if all tasks are marked complete (run inside the loop)
if ! grep -q '^- \[ \]' PROGRESS.md; then
  echo "✅ All tasks complete"
  break
fi
This verifies actual progress (checked boxes) rather than trusting model output.
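One edge case in the check above: grep also exits non-zero when PROGRESS.md is missing or unreadable, so a negated grep would report completion if an iteration deleted or renamed the file. A stricter variant is sketched below. The function name progress_complete is hypothetical, and requiring at least one checked item is an added assumption, not part of the original protocol.

```shell
# Hypothetical helper: succeed only when the state file exists, has at least
# one checked item, and has no unchecked items.
# The body runs in a subshell, so "exit" only leaves the function.
progress_complete() (
  file="${1:-PROGRESS.md}"
  [ -f "$file" ] || exit 1                 # missing file is never "complete"
  grep -q '^- \[x\]' "$file" || exit 1     # nothing checked yet
  ! grep -q '^- \[ \]' "$file"             # any unchecked item means not done
)
```

Inside the loop this would replace the bare grep: if progress_complete PROGRESS.md; then break; fi.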
Safety Philosophy
Your real safety net is the sandboxed branch:
- All work happens on kalshi-ralph (or similar)
- Main branch is untouched
- You can always git checkout main && git branch -D kalshi-ralph
- Audit commits before merging
Tool restrictions are optional paranoia. If you're in a sandboxed branch:
- --dangerously-skip-permissions is fine
- Blocking rm would break legitimate cleanup operations
- The worst case is you delete the branch and start over
When to use granular permissions:
- Running on production systems (don't do Ralph on prod)
- Shared environments where mistakes affect others
- When you don't trust your prompt/specs
Rate Limits
- The sleep 2 between iterations helps avoid rate limits
- If you hit limits, increase sleep or add exponential backoff
- If the same failure repeats 3+ times, stop and fix the prompt
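The exponential backoff mentioned above can be sketched as a small delay calculator. This is an illustration only: the function name backoff_delay, the default base of 2 seconds, and the 300-second cap are all assumptions.

```shell
# Hypothetical helper: delay in seconds for a retry attempt (0-based),
# doubling from BASE each attempt and capped at MAX.
# Usage: backoff_delay ATTEMPT [BASE] [MAX]
backoff_delay() (
  attempt="$1"
  base="${2:-2}"
  max="${3:-300}"
  delay="$base"
  i=0
  # Stop doubling once the cap is reached to avoid needless large numbers.
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$max" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$max" ]; then
    delay="$max"
  fi
  echo "$delay"
)
```

In a loop, a counter of consecutive failed iterations (hypothetical, for example failures) would feed it: sleep "$(backoff_delay "$failures")", resetting the counter to 0 after a successful iteration.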
Monitoring
Watch Progress
# In another terminal/tmux pane
watch -n 5 'head -50 PROGRESS.md'
# Or check git activity
watch -n 5 'git log --oneline -10'
Note: watch is not installed by default on macOS. Use brew install watch or replace with a small while true; do ...; sleep 5; done loop.
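The replacement loop mentioned in the note can be written as a reusable function. A minimal sketch: the name poll and its argument order are hypothetical, and unlike watch it does not clear the screen between runs.

```shell
# Hypothetical stand-in for watch: run a command every INTERVAL seconds.
# COUNT limits the number of runs; 0 means run until interrupted (Ctrl+C).
# Usage: poll INTERVAL COUNT COMMAND [ARGS...]
poll() (
  interval="$1"
  count="$2"
  shift 2
  n=0
  while [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; do
    "$@"
    n=$((n + 1))
    sleep "$interval"
  done
)
```

For example, poll 5 0 git log --oneline -10 approximates the watch command shown above.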
Check Loop Status
# See recent commits
git log --oneline -20
# See what changed
git diff HEAD~1
# Check test status
uv run pytest tests/ -v --tb=short
Post-Loop Audit
Review All Changes
# See all commits from Ralph
git log main..kalshi-ralph --oneline
# See full diff
git diff main..kalshi-ralph
# Review specific commit
git show <commit-hash>
Run Quality Gates
uv run ruff check .
uv run ruff format --check .
uv run mypy src/
uv run pytest tests/ -v
Merge if Good
# If everything looks good
git checkout dev
git merge kalshi-ralph
git checkout main
git merge dev
git push origin main
If you skipped dev, open a PR from kalshi-ralph → main, or merge directly after review.
Revert if Bad
# Nuclear option - delete branch entirely
git checkout main
git branch -D kalshi-ralph
# Or revert specific commits
git revert <bad-commit-hash>
Best Practices
DO
- ✅ Always sandbox in dedicated branch (this is your real safety)
- ✅ Use detailed specs for each task
- ✅ Require atomic commits
- ✅ Set clear completion criteria
- ✅ Set iteration limits (MAX=50 or similar)
- ✅ Keep the prompt focused and stable
- ✅ Monitor periodically
- ✅ Audit before merging
DON'T
- ❌ Run on main branch (use a sandbox branch!)
- ❌ Skip the state file (PROGRESS.md is the brain)
- ❌ Allow multi-task iterations (one task = one iteration)
- ❌ Trust without auditing
- ❌ Use vague task descriptions
- ❌ Run without iteration limits (infinite loops burn money)
Prompt Tuning Tips
- Be explicit - Claude follows instructions literally
- One task rule - Prevents context overload
- Quality gates - Catch issues early
- Read first - Always read state before acting
- Atomic commits - Easy rollback if needed
- Escape hatch - If stuck after N tries, require a blocking report + stop
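The escape hatch also has an operator-side counterpart: stop the loop when several iterations in a row produce no new commit. The sketch below treats an unchanged HEAD as a proxy for "stuck", which is an assumption rather than part of the original protocol; the function name next_stall_count and the limit of 3 are hypothetical.

```shell
# Hypothetical helper: given HEAD before and after an iteration plus the
# current stall count, print the new count (reset to 0 when a commit landed).
# Usage: next_stall_count HEAD_BEFORE HEAD_AFTER CURRENT_COUNT
next_stall_count() {
  if [ "$1" = "$2" ]; then
    echo $(( $3 + 1 ))
  else
    echo 0
  fi
}

# Usage inside the loop (not executed here):
#   before=$(git rev-parse HEAD)
#   claude --dangerously-skip-permissions -p "$(cat PROMPT.md)"
#   after=$(git rev-parse HEAD)
#   stalls=$(next_stall_count "$before" "$after" "${stalls:-0}")
#   if [ "$stalls" -ge 3 ]; then echo "No commits in 3 iterations" >&2; break; fi
```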
Example: Kalshi Research Platform
This protocol was used to build the Kalshi prediction market research platform.
Initial State
- Core platform built (SPEC-001 through SPEC-004)
- 185 tests passing, 81% coverage
- Several bugs and features remaining
Ralph Loop Results
| Commit | Task | Time |
|---|---|---|
| 9e4e55e | BUG-007: Fix CI/CD | ~2 min |
| 7a03b97 | QUALITY-001: Fix mypy error | ~1 min |
| 8f9da97 | QUALITY-002: Fix ruff issues | ~1 min |
| 5b3ab7c | QUALITY-003: Verify gates | ~1 min |
| 394719f | SPEC-005: Alerts module | ~5 min |
| 9feab0e | SPEC-006: Correlation analysis | ~5 min |
| 9ecefc6 | SPEC-007: Visualization | ~5 min |
| ... | SPEC-008, FINAL-* | ongoing |
Files Created
PROMPT.md # Loop prompt
PROGRESS.md # State tracking
docs/_bugs/ # Bug documentation
docs/_specs/ # Spec documentation
docs/_ralph-wiggum/ # This protocol
Key Learnings
- Fresh context can work - External loops start clean and reduce drift
- State files are critical - PROGRESS.md is the brain
- Atomic commits enable auditing - Easy to review each change
- Sandboxing is essential - Never risk main branch
- TDD keeps quality high - Tests catch regressions
Troubleshooting
Loop Stops Unexpectedly
# Check if Claude is running (pgrep avoids matching the grep process itself)
pgrep -fl claude
# Check tmux session
tmux list-sessions
# Restart loop
tmux attach -t kalshi-ralph
# Then re-run the while loop
Claude Gets Stuck
- Check PROGRESS.md for unclear tasks
- Add more detail to the spec doc
- Kill current iteration (Ctrl+C)
- Loop will restart with fresh context
Permission Prompts Block Autonomy
Use --dangerously-skip-permissions to bypass all prompts:
claude --dangerously-skip-permissions -p "$(cat PROMPT.md)"
Or use --allowedTools for granular control:
claude -p "$(cat PROMPT.md)" --allowedTools "Read,Write,Edit,Bash"
Quality Gates Failing
- Loop should auto-fix on next iteration
- If persistent, check the spec for issues
- May need to manually intervene
Merge Conflicts
# On ralph branch
git fetch origin
git rebase origin/main
# Resolve conflicts
git rebase --continue
References
- Geoffrey Huntley - Ralph Wiggum
- Anthropic Claude Code Plugins Directory
- Anthropic ralph-loop plugin
- RepoMirror field report: “We Put a Coding Agent in a While Loop…”
- Ralph Orchestrator
- Ralph for Claude Code (community, adds rate limits/circuit breakers)
- Claude Code documentation
- Claude Code plugins documentation
Quick Start Checklist
# 1. Create sandbox branch
git checkout dev && git pull
git checkout -b kalshi-ralph
# 2. Ensure PROGRESS.md and PROMPT.md exist in root
# (They should already be there - permanent location)
ls PROGRESS.md PROMPT.md
# 3. Ensure spec docs exist for each task (docs/_bugs/, docs/_debt/, docs/_specs/, docs/_future/)
# 4. Start the loop (creates/attaches tmux session: kalshi-ralph)
./scripts/ralph-loop.sh start
# 5. Monitor in another pane (optional)
watch -n 5 'git log --oneline -10'
# 6. Audit when done
git log dev..kalshi-ralph --oneline
git diff dev..kalshi-ralph --stat
# 7. Merge if good (after review)
git checkout dev && git merge kalshi-ralph
# Or open PR for review