Y'all Need a Smarter Build Loop

Context Foundry is a self-hosted AI agent pipeline that runs your build loop, learns from its mistakes, and gets better every ride.

Rust CLI · v0.7.0 · MIT License

Context Foundry takes a task queue and drives each task through a four-stage pipeline: Scout reads the territory, Plan draws the route, Implement drives the herd, and Doubt inspects every steer at the gate.

No agent shares a context window with any other. Each one starts clean and receives only curated artifacts from the previous stage: a structured report, a deterministic plan, a set of build claims. No bloated conversation history, no accumulated reasoning, no inherited blind spots. Each stage gets signal, not noise. Because one stage's faulty reasoning never rides along into the next, errors are far less likely to compound across a long task queue.

The system extracts patterns from solved problems and feeds them back into future plans, so it gets better at your codebase over time. It runs on your machine, calls the AI providers you choose, and commits verified code to git. No cloud service. No vendor lock-in. Just a Rust binary and a terminal.

~:~=~:~=~:~=~:~=~:~=~:~

The Trail Drive

Every task follows the trail from start to pen. The SPID pipeline (Scout, Plan, Implement, Doubt) has four stages plus a Ship step that handles git. Two ancillary stages, Discovery and Pattern Extraction, run outside the per-task trail.

TASKS.md -> SCOUT (reads the territory) -> PLAN (draws the route) -> IMPLEMENT (drives the herd) -> DOUBT (inspects at the gate) -> SHIP (brands and ships: git commit)
Outside the per-task trail: DISCOVERY, PATTERNS
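The trail order can be sketched as a small state machine. This is a minimal illustration, not Context Foundry's own code: the `Stage` enum, `next`, `artifact`, and `trail` are hypothetical names, and only the artifact paths come from the stage tables in this section.

```rust
/// Per-task stages in trail order. Hypothetical type for illustration;
/// the real pipeline's types are not shown in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    Scout,
    Plan,
    Implement,
    Doubt,
    Ship,
}

impl Stage {
    /// The stage that follows this one, or None once the task has shipped.
    fn next(self) -> Option<Stage> {
        match self {
            Stage::Scout => Some(Stage::Plan),
            Stage::Plan => Some(Stage::Implement),
            Stage::Implement => Some(Stage::Doubt),
            Stage::Doubt => Some(Stage::Ship),
            Stage::Ship => None,
        }
    }

    /// The file each stage writes, as listed in the stage tables.
    fn artifact(self) -> Option<&'static str> {
        match self {
            Stage::Scout => Some(".buildloop/scout-report.md"),
            Stage::Plan => Some(".buildloop/current-plan.md"),
            Stage::Implement => Some(".buildloop/build-claims.md"),
            Stage::Doubt => Some(".buildloop/review-report.md"),
            Stage::Ship => None, // Ship produces a git commit, not a file
        }
    }
}

/// Walks the trail from Scout to Ship.
fn trail() -> Vec<Stage> {
    let mut stages = vec![Stage::Scout];
    let mut current = Stage::Scout;
    while let Some(next) = current.next() {
        stages.push(next);
        current = next;
    }
    stages
}
```

Modeling the hand-off as one artifact per stage mirrors the idea that each agent sees a curated file rather than the previous agent's conversation.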

Scout -- "Rides Ahead"

The scout rides ahead and maps the territory before the herd moves. It reads TASKS.md, SPEC.md, and your source files, then writes a report of what exists and what matters for the current task.

Aspect     Detail
Agent      AgentRole::Scout
Input      TASKS.md, SPEC.md, source files
Output     .buildloop/scout-report.md
Skip when  Fast mode and scout-report.md already exists

Plan -- "Draws the Route"

The planner reads the scout report, loads matching patterns from the brand book, and writes a step-by-step implementation plan with exact file operations, function signatures, and verification commands. No ambiguity -- the plan is for a machine, not a human.

Aspect     Detail
Agent      AgentRole::Planner
Input      scout-report.md, matched patterns, extension context
Output     .buildloop/current-plan.md
Skip when  skip_planner_for_simple is true and task complexity is Simple
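The "Skip when" rows for Scout and Plan boil down to two boolean checks. The sketch below assumes hypothetical function names and a hypothetical `Complexity` enum (the document only names the Simple level); it shows the rule, not the actual implementation.

```rust
/// Task complexity. Only `Simple` is named in the document;
/// `Complex` is a placeholder for every other level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Complexity {
    Simple,
    Complex,
}

/// Scout is skipped in fast mode when a scout report is already on disk.
fn skip_scout(fast_mode: bool, scout_report_exists: bool) -> bool {
    fast_mode && scout_report_exists
}

/// The planner is skipped only when the config flag is set
/// and the task is rated Simple.
fn skip_planner(skip_planner_for_simple: bool, complexity: Complexity) -> bool {
    skip_planner_for_simple && complexity == Complexity::Simple
}
```

Both conditions are conjunctions, so turning off either input (fast mode, or the config flag) always forces the stage to run.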

Implement -- "Drives the Herd"

The builder takes the plan and drives the cattle down the trail. It creates and modifies files, runs the build, runs tests, and records what it claims to have done. In dual mode, two builders fork into separate worktrees and drive independent herds.

Aspect     Detail
Agent      AgentRole::Builder
Input      .buildloop/current-plan.md
Output     .buildloop/build-claims.md
Dual mode  Parallel git worktrees under .buildloop/arena/

Doubt -- "The Trail Boss"

The trail boss inspects every steer at the gate before they enter the pen. A fresh-context reviewer reads the builder's claims, re-runs verification, and fixes issues it finds. If everything passes, the task ships as feat(). If not, it ships as WIP() and the herd keeps moving.

Aspect     Detail
Agent      Reviewer + Fixer (fresh context)
Input      .buildloop/build-claims.md
Output     .buildloop/review-report.md
On PASS    Commit as feat(task_id): description
On FAIL    Fixer runs (2 passes), then commit as WIP(task_id): description
Skip when  backpressure_only is true, or batch_doubt defers to last task
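The PASS and FAIL rows above differ only in the commit prefix. A minimal sketch of that naming rule, assuming a hypothetical `commit_subject` helper and a made-up task ID format:

```rust
/// Builds the commit subject line from the Doubt verdict.
/// `feat(...)` and `WIP(...)` are the prefixes named in the table;
/// the function name and task ID format are illustrative.
fn commit_subject(review_passed: bool, task_id: &str, description: &str) -> String {
    let prefix = if review_passed { "feat" } else { "WIP" };
    format!("{prefix}({task_id}): {description}")
}
```

Because failed tasks still land in history under a distinct prefix, a plain `git log --grep` for the WIP prefix is enough to round up the strays later.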
*~~~*~~~*~~~*~~~*~~~*~~~*

The Ranch Hands

Four ways to run the outfit. Each ranch hand has their own style. Cycle between the first three with Ctrl+M. The fourth is controlled by Ctrl+D -- it works alongside any of the other three.

The Tireless Hand

Loop Mode -- keeps riding until there's nothing left to do

Processes every task in TASKS.md, then runs Discovery to find more work. When the trail ends, it finds a new trail. Never stops unless you tell it to.

run_mode     "auto"
Task source  TASKS.md + Discovery appends more
Review gate  None -- auto-commits after Doubt
Commit       feat() on pass, WIP() on fail
Hotkey       Ctrl+M (cycles to Sprint)

The Hard Rider

Sprint Mode -- rides hard, stops when the job's done

Burns through TASKS.md and stops. No Discovery, no new trails. You told it what to do and it does exactly that.

run_mode     "sprint"
Task source  TASKS.md only -- no Discovery
Review gate  None -- auto-commits after Doubt
Commit       feat() on pass, WIP() on fail
Hotkey       Ctrl+M (cycles to Review)

The Foreman

Review Mode -- won't let anything ship without a sign-off

One task at a time. After Doubt, it creates a PR on a foundry/{task_id} branch and waits for human approval. Press Enter to continue or let the PR poller detect approval.

run_mode     "review"
Task source  TASKS.md, one task at a time
Review gate  Human gate -- waits for Enter or PR approval
Commit       Per-task PR on foundry/{task_id} branch
PR poller    Every pr_poll_interval_secs (default 30s)
Hotkey       Ctrl+M (cycles to Auto)
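Taken together, the three Hotkey rows describe a simple three-state cycle. A sketch under stated assumptions: the `RunMode` enum and its method names are hypothetical, while the config strings and the cycle order come from the tables above.

```rust
/// The three run modes reachable with Ctrl+M. Hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RunMode {
    Auto,   // Loop Mode, "The Tireless Hand"
    Sprint, // Sprint Mode, "The Hard Rider"
    Review, // Review Mode, "The Foreman"
}

impl RunMode {
    /// One Ctrl+M press: Auto -> Sprint -> Review -> Auto.
    fn next(self) -> RunMode {
        match self {
            RunMode::Auto => RunMode::Sprint,
            RunMode::Sprint => RunMode::Review,
            RunMode::Review => RunMode::Auto,
        }
    }

    /// The value written to `run_mode` in the config.
    fn as_config_str(self) -> &'static str {
        match self {
            RunMode::Auto => "auto",
            RunMode::Sprint => "sprint",
            RunMode::Review => "review",
        }
    }

    /// Only Review mode holds the task at a human gate after Doubt.
    fn waits_for_human(self) -> bool {
        self == RunMode::Review
    }
}
```

Three presses of Ctrl+M bring the outfit back to where it started.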

The Racing Pair

Dual Mode -- two cowboys racing to see who ropes the steer first

Orthogonal to the modes above. When builder_models has 2+ entries, Ctrl+D cycles through which model(s) execute: Off, First only, Second only, or Both. Works with any run_mode.

Control    Ctrl+D (cycles: Off -> First -> Second -> Both -> First)
When Both  Two builders fork into parallel git worktrees
Selection  Human picks the winner
Config     builder_models: ["claude:opus", "codex:o4-mini"]
-=<*>=-=<*>=-=<*>=-=<*>=

The Pattern Brand

Every solved problem gets branded with a pattern_id so the herd can be identified and sorted at the gate. Patterns live in ~/.foundry/patterns/ as JSON files. Before planning each task, matching patterns are loaded and injected into the planner's context. The more cattle you drive, the fewer strays at the gate.

The Brand Mark (JSON Schema)

{
  "pattern_id":   "string -- unique identifier",
  "title":        "string -- human-readable name",
  "severity":     "HIGH | MEDIUM | LOW",
  "keywords":     ["array", "of", "match", "terms"],
  "tech_stack":   ["rust", "react", "python"],
  "issue":        "string -- what went wrong",
  "solution": {
    "planner":    "string -- advice injected into planner prompt",
    "reviewer":   "string -- advice injected into reviewer prompt"
  },
  "frequency":    42,
  "first_seen":   "2026-01-15",
  "last_seen":    "2026-03-17",
  "auto_apply":   false,
  "learned_from": "project-name"
}

How Patterns Are Matched

At plan time, the system loads all patterns from ~/.foundry/patterns/ and matches them against the current task description using keyword overlap. If semantic_match_enabled is true, Ollama embeddings provide semantic similarity scoring alongside keyword matching. Up to max_pattern_injection (default 10) patterns are injected into the planner and reviewer prompts as reference data.
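Keyword-overlap matching can be sketched in a few lines. This is one plausible reading of the description above, not the project's actual scoring: the `Pattern` struct is trimmed to two schema fields, `keyword_overlap` and `select_patterns` are hypothetical names, and semantic scoring via Ollama is left out entirely.

```rust
use std::collections::HashSet;

/// A trimmed-down pattern: only the fields this sketch needs.
struct Pattern {
    pattern_id: String,
    keywords: Vec<String>,
}

/// Counts how many of the pattern's keywords appear as whole words
/// in the task description (case-insensitive). Multi-word or
/// hyphenated keywords never match in this simplified version.
fn keyword_overlap(task: &str, pattern: &Pattern) -> usize {
    let words: HashSet<String> = task
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect();
    pattern
        .keywords
        .iter()
        .filter(|k| words.contains(&k.to_lowercase()))
        .count()
}

/// Returns up to `max_pattern_injection` patterns with at least one
/// matching keyword, highest overlap first. Ties keep input order
/// because the sort is stable.
fn select_patterns<'a>(
    task: &str,
    patterns: &'a [Pattern],
    max_pattern_injection: usize,
) -> Vec<&'a Pattern> {
    let mut scored: Vec<(usize, &Pattern)> = patterns
        .iter()
        .map(|p| (keyword_overlap(task, p), p))
        .filter(|(score, _)| *score > 0)
        .collect();
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored
        .into_iter()
        .take(max_pattern_injection)
        .map(|(_, p)| p)
        .collect()
}
```

The cap matters more than the scoring: it bounds how much reference data reaches the planner and reviewer prompts no matter how large the brand book grows.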

Local patterns (project-level JSON files) can be promoted to the global registry using merge_project_patterns(), available as an MCP tool. This is how a solved problem in one project prevents the same mistake in every future project.

~:~=~:~=~:~=~:~=~:~=~:~

The Arena

Two cowboys out of the chute at the same time. The human judge decides who scored.

When builder_models lists two or more model specs (e.g., ["claude:opus", "codex:o4-mini"]), the arena is available. Ctrl+D cycles through four states: Off (single model), First only, Second only, and Both. In Both mode, run_dual_pipelines() forks execution into two independent git worktrees.

How It Works

  1. Two git worktrees are created under .buildloop/arena/pipeline-0-{provider}/ and .buildloop/arena/pipeline-1-{provider}/
  2. Each worktree gets a full copy of the project
  3. Each model runs the full Plan-Implement-Doubt pipeline independently in its own worktree
  4. Events are tagged by pipeline index (0 or 1) via DualPipelineEvent
  5. The TUI shows a tabbed view -- one tab per model -- so you can watch both trails simultaneously
  6. When both finish, the human evaluates and picks the winner

Worktree Layout

.buildloop/arena/
  pipeline-0-claude/     (git worktree for model[0])
    .buildloop/
      scout-report.md    (copied from main)
      current-plan.md    (generated by model[0])
      build-claims.md    (generated by model[0])
      review-report.md   (generated by model[0])
  pipeline-1-codex/      (git worktree for model[1])
    .buildloop/
      scout-report.md
      current-plan.md
      build-claims.md
      review-report.md
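The directory names in the layout follow from the pipeline index and the provider half of each "provider:model" spec. A minimal sketch of that derivation, with hypothetical helper names (`parse_model_spec`, `arena_worktree`); how the real code parses specs is not shown in this document.

```rust
use std::path::PathBuf;

/// Splits a "provider:model" spec such as "claude:opus".
/// Returns None when either half is missing.
fn parse_model_spec(spec: &str) -> Option<(&str, &str)> {
    let (provider, model) = spec.split_once(':')?;
    if provider.is_empty() || model.is_empty() {
        return None;
    }
    Some((provider, model))
}

/// Builds the worktree path .buildloop/arena/pipeline-{index}-{provider}
/// for the builder at position `index` in builder_models.
fn arena_worktree(index: usize, spec: &str) -> Option<PathBuf> {
    let (provider, _model) = parse_model_spec(spec)?;
    Some(
        PathBuf::from(".buildloop")
            .join("arena")
            .join(format!("pipeline-{index}-{provider}")),
    )
}
```

Including the index in the name keeps the two worktrees distinct even if both entries in builder_models use the same provider.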

Ctrl+D Cycle

State          Behavior                                                     Next (Ctrl+D)
Off (default)  Single-model execution using builder_model/builder_provider  First
First          Uses first entry from builder_models array                   Second
Second         Uses second entry from builder_models array                  Both
Both           run_dual_pipelines() -- both models in parallel worktrees    First

Note: After Both, Ctrl+D returns to First, not Off. The cycle is First -> Second -> Both -> First.
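The asymmetry in that note is easy to miss, so here is the cycle as a sketch. The `DualSelection` enum and its methods are hypothetical, and mapping each state to a `dual_selection` config string is an assumption based on the values listed for that field.

```rust
/// Arena states reachable with Ctrl+D. Hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DualSelection {
    Off,
    First,
    Second,
    Both,
}

impl DualSelection {
    /// One Ctrl+D press. Off is only a starting point:
    /// after Both the cycle returns to First, never to Off.
    fn next(self) -> DualSelection {
        match self {
            DualSelection::Off => DualSelection::First,
            DualSelection::First => DualSelection::Second,
            DualSelection::Second => DualSelection::Both,
            DualSelection::Both => DualSelection::First,
        }
    }

    /// Assumed mapping to the `dual_selection` config value.
    fn as_config_str(self) -> &'static str {
        match self {
            DualSelection::Off => "",
            DualSelection::First => "first",
            DualSelection::Second => "second",
            DualSelection::Both => "both",
        }
    }
}
```

Starting from Off, no number of Ctrl+D presses leads back to Off under this reading of the cycle.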

*~~~*~~~*~~~*~~~*~~~*~~~*

The Deed

Every ranch has a title document. Yours is .foundry.json at the project root. Here are the fields that matter most.

Field                           Type            Default                   Description

Execution
run_mode                        String          "auto"                    Execution mode: "auto", "sprint", or "review"
pipeline_mode                   String          "full"                    Pipeline mode: "full", "fast", or "backpressure"
batch_doubt                     bool            true                      Defer Doubt to last task only
backpressure_only               bool            true                      Skip LLM review, rely on build/test/lint only
skip_planner_for_simple         bool            true                      Skip planner for Simple-complexity tasks

Models & Providers
scout_model                     String          "sonnet"                  Model name for Scout agent
scout_provider                  String          "claude"                  Provider for Scout: "claude" or "codex"
planner_model                   String          "opus"                    Model name for Planner agent
planner_provider                String          "claude"                  Provider for Planner: "claude" or "codex"
builder_model                   String          "opus"                    Model name for Builder agent
builder_provider                String          "claude"                  Provider for Builder: "claude" or "codex"
reviewer_model                  String          "sonnet"                  Model name for Reviewer agent
reviewer_provider               String          "claude"                  Provider for Reviewer: "claude" or "codex"
fixer_model                     String          "sonnet"                  Model name for Fixer agent
fixer_provider                  String          "claude"                  Provider for Fixer: "claude" or "codex"
discovery_model                 String          "opus"                    Model name for Discovery agent
discovery_provider              String          "claude"                  Provider for Discovery: "claude" or "codex"
pattern_extraction_model        String          "sonnet"                  Model for pattern extraction (lightweight JSON output)

Simple Task Overrides
simple_planner_model            String          "sonnet"                  Planner model override for Simple-complexity tasks
simple_builder_model            String          "sonnet"                  Builder model override for Simple-complexity tasks
simple_reviewer_model           String          ""                        Reviewer model for Simple tasks (empty = use backpressure_only)

Dual-Model Arena
builder_models                  Vec<String>     []                        Dual-model specs, format "provider:model". Overrides builder_model when len >= 2.
dual_selection                  String          "" (off)                  Dual-build selection: "first", "second", "both", or "" (off)

Review
review_mode                     String          "diff-only"               Review input: "diff-only" passes git diff, "file-list" uses changed file list
review_multipass_threshold      usize           8                         File count threshold for multi-pass review (0 = disable)
confidence_threshold            f64             0.5                       Reviewer finding confidence below this is logged, not auto-fixed

Timing & Pauses
agent_timeout_secs              u64             600                       Max seconds before killing an agent (10 min)
adaptive_pauses                 bool            true                      Skip inter-agent sleep when last agent was not rate-limited
pause_between_tasks_secs        u64             10                        Seconds between tasks (skipped when adaptive_pauses and no rate limit)
pause_between_agents_secs       u64             3                         Seconds between pipeline stages
pause_between_cycles_secs       u64             30                        Seconds between discovery cycles
planner_lookahead               bool            true                      Pre-plan task N+1 while builder runs task N
discovery_cooldown_minutes      u64             5                         Minutes to wait before first discovery (doubles on 0-find rounds, max 30)

Git & PRs
auto_push_remote                Option<String>  null                      Git remote to auto-push after commits (null = local only)
pr_poll_interval_secs           u64             30                        Seconds between PR review status checks
create_issue_on_wip             bool            false                     Create GitHub issue on WIP (failed) commits

Patterns & Embeddings
patterns_dir                    String          "~/.foundry/patterns"     Directory for global pattern store
max_pattern_injection           usize           10                        Max patterns injected into agent prompts
semantic_match_enabled          bool            true                      Enable semantic pattern matching via Ollama embeddings
embedding_model                 String          "nomic-embed-text"        Ollama model for semantic pattern embeddings
ollama_url                      String          "http://127.0.0.1:11435"  Ollama API base URL for embedding requests
embedding_timeout_ms            u64             2000                      Timeout for Ollama embedding requests in milliseconds

Orchestrator
orchestrator_proposer_provider  String          "claude"                  Provider for orchestrator proposer
orchestrator_proposer_model     String          "opus"                    Model for orchestrator proposer
orchestrator_reviewer_provider  String          "claude"                  Provider for orchestrator reviewer
orchestrator_reviewer_model     String          "opus"                    Model for orchestrator reviewer
orchestrator_max_iterations     usize           3                         Max proposer/reviewer iterations
orchestrator_accept_policy      String          "no-high-medium"          Acceptance: "no-high", "no-high-medium", or "no-findings"
planning_iterations             u64             0                         Max iterations for foundry plan mode (0 = unlimited)

Extensions
extensions                      Vec<String>     []                        Selected extension names (e.g., ["roblox", "extend"])

Display & Limits
theme                           String          "dark"                    TUI color theme: "dark", "catppuccin", or "solarized"
truecolor                       Option<bool>    null                      Override truecolor detection (null = auto-detect)
preview_wrap                    bool            true                      Wrap long lines in build preview panel
cost_limit                      f64             0.0                       Max session cost in USD (0.0 = no limit)
build_command                   Option<String>  null                      Custom build/compile command run after builder completes
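One row worth working through is discovery_cooldown_minutes, which "doubles on 0-find rounds, max 30". The sketch below is one reading of that rule: the function name is hypothetical, the cap is also applied to the starting value as an assumption, and what happens after a round that does find work is not covered because the table does not say.

```rust
/// Upper bound from the table: the cooldown never exceeds 30 minutes.
const MAX_COOLDOWN_MINUTES: u64 = 30;

/// Cooldown after `empty_rounds` consecutive discovery rounds that
/// found nothing, starting from `base_minutes` (default 5).
/// Each empty round doubles the wait until the cap is reached.
fn cooldown_after_empty_rounds(base_minutes: u64, empty_rounds: u32) -> u64 {
    // Assumption: a base above the cap is clamped as well.
    let mut minutes = base_minutes.min(MAX_COOLDOWN_MINUTES);
    for _ in 0..empty_rounds {
        minutes = minutes.saturating_mul(2).min(MAX_COOLDOWN_MINUTES);
    }
    minutes
}
```

With the default of 5, the waits run 5, 10, 20, then hold at 30, so a quiet codebase costs at most two discovery calls an hour.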
-=<*>=-=<*>=-=<*>=-=<*>=

The Posse

Specialist deputies, each with their own jurisdiction. Extensions provide domain-specific knowledge to agents. They are discovered from ~/.foundry/extensions/ (global) and <project>/extensions/ (local). Each extension directory contains a CLAUDE.md file whose contents are prepended to agent prompts.
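The discovery and prepend steps can be sketched without touching the filesystem. Everything named here is hypothetical (`extension_candidates`, `prepend_extension_context`), the blank-line separator is an assumption, and the sketch takes no position on whether a local extension overrides a global one of the same name, since the document does not say.

```rust
use std::path::{Path, PathBuf};

/// The two places an extension's CLAUDE.md may live:
/// the global store under the home directory, then the project.
fn extension_candidates(home: &Path, project: &Path, name: &str) -> [PathBuf; 2] {
    [
        home.join(".foundry")
            .join("extensions")
            .join(name)
            .join("CLAUDE.md"),
        project.join("extensions").join(name).join("CLAUDE.md"),
    ]
}

/// Prepends each extension document to the agent prompt,
/// separated by a blank line (separator is an assumption).
fn prepend_extension_context(extension_docs: &[String], agent_prompt: &str) -> String {
    let mut out = String::new();
    for doc in extension_docs {
        out.push_str(doc.trim_end());
        out.push_str("\n\n");
    }
    out.push_str(agent_prompt);
    out
}
```

Keeping the prepend step a pure function of already-loaded text makes it easy to check exactly what a deputy adds to a prompt before any agent runs.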

Roblox

extensions/roblox/

World generation using Lune scripting and .rbxl/.rbxm files. Key files: CLAUDE.md, patterns/roblox-common-issues.json. Gotcha: use add_to_world.luau, never generate_world.luau -- terrain has varying elevations that cause floating objects.

Workday Extend

extensions/extend/

Workday Extend PaaS application development. Key files: CLAUDE.md, WORKDAY_EXTEND_DEVELOPER_GUIDE.md, WORKDAY_EXTEND_ARCHITECTURE.md. Gotcha: apps are metadata-driven with no arbitrary code execution -- WIDs are tenant-specific, always use Reference IDs instead.

Recon

extensions/recon/

Fleet reconnaissance from a management server -- inventory lookups, iDRAC queries, racadm commands against Dell servers. Key files: CLAUDE.md, config/inventory-schema.json. Gotcha: always use grep -w for hostname lookups and label batch output with the current hostname.

Flowise

extensions/flowise/

Building Flowise AgentFlow v2 workflows. Output is single JSON files. Key files: CLAUDE.md, FLOWISE.md, node-templates/*.json. Gotcha: use HTML span format for all variable references and follow the validation checklist before importing.

Mindcraft

extensions/mindcraft/

Python orchestrator for an AI Minecraft bot that plays autonomously. Monitors bot state, assigns goals, recalls the bot if it wanders too far from spawn. Architecture: client, detector, goals, learner, monitor, orchestrator, persistence, planner modules.

~:~=~:~=~:~=~:~=~:~=~:~

Trail's End

Here is what matters for architects evaluating this system.