Context Foundry is a self-hosted AI agent pipeline that runs your build loop, learns from its mistakes, and gets better every ride.
Rust CLI · v0.7.0 · MIT License
Context Foundry takes a task queue and drives each task through a four-stage pipeline: Scout reads the territory, Plan draws the route, Implement drives the herd, and Doubt inspects every steer at the gate.
No agent shares a context window with any other. Each one starts clean and receives only curated artifacts from the previous stage -- a structured report, a deterministic plan, a set of build claims. No bloated conversation history, no accumulated reasoning, no inherited blind spots. Each stage gets signal, not noise. This is how the loop prevents compounding errors across a long task queue.
The system extracts patterns from solved problems and feeds them back into future plans, so it gets better at your codebase over time. It runs on your machine, calls the AI providers you choose, and commits verified code to git. No cloud service. No vendor lock-in. Just a Rust binary and a terminal.
Every task follows the trail from start to pen. The SPID pipeline (Scout, Plan, Implement, Doubt) has four stages plus a Ship step that handles git. Two ancillary stages -- Discovery and Pattern Extraction -- run outside the per-task trail.
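The per-task hand-off can be pictured as a chain of files, where each stage reads only what the previous stage wrote. The sketch below is illustrative, not the crate's code: the `Stage` enum and its methods are hypothetical names, and the artifact paths are the ones listed in the stage tables in this document.

```rust
/// Hypothetical stage enum for illustration; not the crate's actual type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stage {
    Scout,
    Plan,
    Implement,
    Doubt,
}

impl Stage {
    /// Artifact each stage writes, as listed in the stage tables.
    fn output(self) -> &'static str {
        match self {
            Stage::Scout => ".buildloop/scout-report.md",
            Stage::Plan => ".buildloop/current-plan.md",
            Stage::Implement => ".buildloop/build-claims.md",
            Stage::Doubt => ".buildloop/review-report.md",
        }
    }

    /// Curated artifact handed over from the previous stage.
    /// Scout has no predecessor: it reads TASKS.md, SPEC.md, and source files.
    fn handoff_input(self) -> Option<&'static str> {
        match self {
            Stage::Scout => None,
            Stage::Plan => Some(Stage::Scout.output()),
            Stage::Implement => Some(Stage::Plan.output()),
            Stage::Doubt => Some(Stage::Implement.output()),
        }
    }

    /// Stage that runs next, or None after Doubt (the Ship step handles git).
    fn next(self) -> Option<Stage> {
        match self {
            Stage::Scout => Some(Stage::Plan),
            Stage::Plan => Some(Stage::Implement),
            Stage::Implement => Some(Stage::Doubt),
            Stage::Doubt => None,
        }
    }
}
```

Walking the chain from `Stage::Scout` with `next()` visits the four stages in order, and `handoff_input()` names the single file that crosses each boundary.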
The scout rides ahead and maps the territory before the herd moves. It reads TASKS.md, SPEC.md, and your source files, then writes a report of what exists and what matters for the current task.
| Aspect | Detail |
|---|---|
| Agent | AgentRole::Scout |
| Input | TASKS.md, SPEC.md, source files |
| Output | .buildloop/scout-report.md |
| Skip when | Fast mode and scout-report.md already exists |
The planner reads the scout report, loads matching patterns from the brand book, and writes a step-by-step implementation plan with exact file operations, function signatures, and verification commands. No ambiguity -- the plan is for a machine, not a human.
| Aspect | Detail |
|---|---|
| Agent | AgentRole::Planner |
| Input | scout-report.md, matched patterns, extension context |
| Output | .buildloop/current-plan.md |
| Skip when | skip_planner_for_simple is true and task complexity is Simple |
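The skip rules for Scout and Planner reduce to two boolean checks. This is a minimal sketch under stated assumptions: the function names and the `Complexity` enum are hypothetical, and "Fast mode" is assumed to mean `pipeline_mode` set to "fast".

```rust
/// Hypothetical complexity label; this document only names "Simple".
#[derive(Debug, Clone, Copy, PartialEq)]
enum Complexity {
    Simple,
    Other,
}

/// Scout is skipped in fast mode when scout-report.md already exists.
/// Assumption: "Fast mode" corresponds to pipeline_mode == "fast".
fn skip_scout(pipeline_mode: &str, report_exists: bool) -> bool {
    pipeline_mode == "fast" && report_exists
}

/// Planner is skipped when skip_planner_for_simple is true
/// and the task complexity is Simple.
fn skip_planner(skip_planner_for_simple: bool, complexity: Complexity) -> bool {
    skip_planner_for_simple && complexity == Complexity::Simple
}
```

Both conditions must hold for a skip, so a fresh checkout with no scout report still gets scouted even in fast mode.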
The builder takes the plan and drives the cattle down the trail. It creates and modifies files, runs the build, runs tests, and records what it claims to have done. In dual mode, two builders fork into separate worktrees and drive independent herds.
| Aspect | Detail |
|---|---|
| Agent | AgentRole::Builder |
| Input | .buildloop/current-plan.md |
| Output | .buildloop/build-claims.md |
| Dual mode | Parallel git worktrees under .buildloop/arena/ |
The trail boss inspects every steer at the gate before they enter the pen. A fresh-context reviewer reads the builder's claims, re-runs verification, and fixes issues it finds. If everything passes, the task ships as feat(). If not, it ships as WIP() and the herd keeps moving.
| Aspect | Detail |
|---|---|
| Agent | Reviewer + Fixer (fresh context) |
| Input | .buildloop/build-claims.md |
| Output | .buildloop/review-report.md |
| On PASS | Commit as feat(task_id): description |
| On FAIL | Fixer runs (2 passes), then commit as WIP(task_id): description |
| Skip when | backpressure_only is true, or batch_doubt defers to last task |
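The gate's outcome maps onto a commit prefix, and the skip rule onto one boolean expression. The sketch below assumes hypothetical names (`Verdict`, `commit_message`, `skip_doubt`) and reads "batch_doubt defers to last task" as "skip Doubt on every task except the last".

```rust
/// Hypothetical review verdict for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verdict {
    Pass,
    Fail,
}

/// Build the commit subject: feat(task_id) on PASS, WIP(task_id) on FAIL.
fn commit_message(verdict: Verdict, task_id: &str, description: &str) -> String {
    let prefix = match verdict {
        Verdict::Pass => "feat",
        Verdict::Fail => "WIP",
    };
    format!("{}({}): {}", prefix, task_id, description)
}

/// Doubt is skipped when backpressure_only is set, or when batch_doubt
/// defers the review and this is not the last task (an interpretation).
fn skip_doubt(backpressure_only: bool, batch_doubt: bool, is_last_task: bool) -> bool {
    backpressure_only || (batch_doubt && !is_last_task)
}
```

For example, `commit_message(Verdict::Fail, "T-12", "add retry logic")` yields a WIP subject, which keeps the herd moving while marking the task for follow-up. The task id format here is made up for the example.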
Four ways to run the outfit. Each ranch hand has their own style. Cycle between the first three with Ctrl+M. The fourth is controlled by Ctrl+D -- it works alongside any of the other three.
Processes every task in TASKS.md, then runs Discovery to find more work. When the trail ends, it finds a new trail. Never stops unless you tell it to.
| Aspect | Detail |
|---|---|
| run_mode | "auto" |
| Task source | TASKS.md + Discovery appends more |
| Review gate | None -- auto-commits after Doubt |
| Commit | feat() on pass, WIP() on fail |
| Hotkey | Ctrl+M (cycles to Sprint) |
Burns through TASKS.md and stops. No Discovery, no new trails. You told it what to do and it does exactly that.
| Aspect | Detail |
|---|---|
| run_mode | "sprint" |
| Task source | TASKS.md only -- no Discovery |
| Review gate | None -- auto-commits after Doubt |
| Commit | feat() on pass, WIP() on fail |
| Hotkey | Ctrl+M (cycles to Review) |
One task at a time. After Doubt, it creates a PR on a foundry/{task_id} branch and waits for human approval. Press Enter to continue or let the PR poller detect approval.
| Aspect | Detail |
|---|---|
| run_mode | "review" |
| Task source | TASKS.md, one task at a time |
| Review gate | Human gate -- waits for Enter or PR approval |
| Commit | Per-task PR on foundry/{task_id} branch |
| PR poller | Every pr_poll_interval_secs (default 30s) |
| Hotkey | Ctrl+M (cycles to Auto) |
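The three run modes and the Ctrl+M cycle fit in a small enum. This is a sketch for illustration only: `RunMode` and its methods are hypothetical names, while the config strings, the cycle order, and the foundry/{task_id} branch format come from the tables above.

```rust
/// Hypothetical enum mirroring the run_mode config values.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RunMode {
    Auto,
    Sprint,
    Review,
}

impl RunMode {
    /// Ctrl+M cycle: Auto, then Sprint, then Review, then back to Auto.
    fn next(self) -> RunMode {
        match self {
            RunMode::Auto => RunMode::Sprint,
            RunMode::Sprint => RunMode::Review,
            RunMode::Review => RunMode::Auto,
        }
    }

    /// String stored under run_mode in .foundry.json.
    fn as_config_str(self) -> &'static str {
        match self {
            RunMode::Auto => "auto",
            RunMode::Sprint => "sprint",
            RunMode::Review => "review",
        }
    }

    /// Only Auto runs Discovery after TASKS.md is exhausted.
    fn runs_discovery(self) -> bool {
        self == RunMode::Auto
    }

    /// Only Review pauses for Enter or PR approval.
    fn waits_for_human(self) -> bool {
        self == RunMode::Review
    }
}

/// Branch used for the per-task PR in Review mode.
fn review_branch(task_id: &str) -> String {
    format!("foundry/{}", task_id)
}
```

Three presses of Ctrl+M from any mode return to that mode, which is what `next()` applied three times does.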
Orthogonal to the modes above. When builder_models has 2+ entries, Ctrl+D cycles through which model(s) execute: Off, First only, Second only, or Both. Works with any run_mode.
| Aspect | Detail |
|---|---|
| Control | Ctrl+D (cycles: Off -> First -> Second -> Both -> First) |
| When Both | Two builders fork into parallel git worktrees |
| Selection | Human picks the winner |
| Config | builder_models: ["claude:opus", "codex:o4-mini"] |
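Each `builder_models` entry uses the "provider:model" format, so splitting on the first colon recovers both halves. A minimal sketch, assuming hypothetical helper names (`parse_model_spec`, `arena_available`) and treating an empty provider or model as invalid, which the document does not specify.

```rust
/// Split a "provider:model" spec at the first colon.
/// Returns None when the colon is missing or either side is empty
/// (the empty-side rule is an assumption, not documented behavior).
fn parse_model_spec(spec: &str) -> Option<(&str, &str)> {
    let (provider, model) = spec.split_once(':')?;
    if provider.is_empty() || model.is_empty() {
        return None;
    }
    Some((provider, model))
}

/// The dual-model controls apply once builder_models has 2+ entries.
fn arena_available(builder_models: &[&str]) -> bool {
    builder_models.len() >= 2
}
```

With the example config, `parse_model_spec("codex:o4-mini")` separates the provider `codex` from the model `o4-mini`; the hyphen inside the model name is untouched because only the first colon is a delimiter.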
Every solved problem gets branded with a pattern_id so the herd can be identified and sorted at the gate. Patterns live in ~/.foundry/patterns/ as JSON files. Before planning each task, matching patterns are loaded and injected into the planner's context. The more cattle you drive, the fewer strays at the gate.

    {
      "pattern_id": "string -- unique identifier",
      "title": "string -- human-readable name",
      "severity": "HIGH | MEDIUM | LOW",
      "keywords": ["array", "of", "match", "terms"],
      "tech_stack": ["rust", "react", "python"],
      "issue": "string -- what went wrong",
      "solution": {
        "planner": "string -- advice injected into planner prompt",
        "reviewer": "string -- advice injected into reviewer prompt"
      },
      "frequency": 42,
      "first_seen": "2026-01-15",
      "last_seen": "2026-03-17",
      "auto_apply": false,
      "learned_from": "project-name"
    }

At plan time, the system loads all patterns from ~/.foundry/patterns/ and matches them against the current task description using keyword overlap. If semantic_match_enabled is true, Ollama embeddings provide semantic similarity scoring alongside keyword matching. Up to max_pattern_injection (default 10) patterns are injected into the planner and reviewer prompts as reference data.
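Keyword-overlap matching can be sketched as a count of pattern keywords that appear in the task description, followed by a top-N cut. This is an illustration under assumptions, not the tool's scoring: the `Pattern` struct is a two-field subset of the schema, and the tokenization, case folding, tie-breaking, and the rule that zero-overlap patterns are dropped are all choices made for this example.

```rust
use std::collections::HashSet;

/// Two-field subset of the pattern schema, for illustration.
struct Pattern {
    pattern_id: String,
    keywords: Vec<String>,
}

/// Count how many of the pattern's keywords occur as words in the task text.
/// Lowercasing and splitting on non-alphanumerics are assumptions.
fn keyword_overlap(task: &str, pattern: &Pattern) -> usize {
    let words: HashSet<String> = task
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect();
    pattern
        .keywords
        .iter()
        .filter(|k| words.contains(&k.to_lowercase()))
        .count()
}

/// Keep patterns with at least one hit, best score first,
/// capped at max_pattern_injection.
fn select_patterns<'a>(
    task: &str,
    patterns: &'a [Pattern],
    max_pattern_injection: usize,
) -> Vec<&'a Pattern> {
    let mut scored: Vec<(usize, &Pattern)> = patterns
        .iter()
        .map(|p| (keyword_overlap(task, p), p))
        .filter(|(score, _)| *score > 0)
        .collect();
    // Stable sort: patterns with equal scores keep their input order.
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored
        .into_iter()
        .take(max_pattern_injection)
        .map(|(_, p)| p)
        .collect()
}
```

Semantic scoring via embeddings, when `semantic_match_enabled` is true, would add a second signal alongside this count; it is left out here because the document does not describe how the two are combined.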
Local patterns (project-level JSON files) can be promoted to the global registry using merge_project_patterns(), available as an MCP tool. This is how a solved problem in one project prevents the same mistake in every future project.
Two cowboys out of the chute at the same time. The human judge decides who scored.
When builder_models lists two or more model specs (e.g., ["claude:opus", "codex:o4-mini"]), the arena is available. Ctrl+D cycles through four states: Off (single model), First only, Second only, and Both. In Both mode, run_dual_pipelines() forks execution into two independent git worktrees.
Worktrees are created at .buildloop/arena/pipeline-0-{provider}/ and .buildloop/arena/pipeline-1-{provider}/. The related event type is DualPipelineEvent.

    .buildloop/arena/
      pipeline-0-claude/          (git worktree for model[0])
        .buildloop/
          scout-report.md         (copied from main)
          current-plan.md         (generated by model[0])
          build-claims.md         (generated by model[0])
          review-report.md        (generated by model[0])
      pipeline-1-codex/           (git worktree for model[1])
        .buildloop/
          scout-report.md
          current-plan.md
          build-claims.md
          review-report.md

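The worktree layout follows directly from the pipeline index and the provider name. A small sketch, assuming hypothetical helper names (`arena_worktree`, `arena_artifact`); only the path pattern itself comes from the layout above.

```rust
use std::path::PathBuf;

/// Worktree path for one pipeline: .buildloop/arena/pipeline-{index}-{provider}
fn arena_worktree(index: usize, provider: &str) -> PathBuf {
    PathBuf::from(".buildloop")
        .join("arena")
        .join(format!("pipeline-{}-{}", index, provider))
}

/// Path of a stage artifact inside a worktree's own .buildloop directory.
fn arena_artifact(index: usize, provider: &str, file_name: &str) -> PathBuf {
    arena_worktree(index, provider)
        .join(".buildloop")
        .join(file_name)
}
```

Because each worktree carries its own .buildloop directory, the two builders never write to the same plan, claims, or review file.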
| State | Behavior | Next (Ctrl+D) |
|---|---|---|
| Off (default) | Single-model execution using builder_model/builder_provider | First |
| First | Uses first entry from builder_models array | Second |
| Second | Uses second entry from builder_models array | Both |
| Both | run_dual_pipelines() -- both models in parallel worktrees | First |
Note: After Both, Ctrl+D returns to First, not Off. The cycle is First -> Second -> Both -> First.
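The Ctrl+D state table translates into a four-variant enum whose cycle never returns to Off. The names `DualSelection`, `next`, and `as_config_str` are hypothetical; the transitions and the `dual_selection` strings are taken from this document.

```rust
/// Hypothetical enum mirroring the dual_selection config values.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DualSelection {
    Off,
    First,
    Second,
    Both,
}

impl DualSelection {
    /// Ctrl+D transition. Off is left once and never re-entered:
    /// after Both the cycle goes back to First.
    fn next(self) -> DualSelection {
        match self {
            DualSelection::Off => DualSelection::First,
            DualSelection::First => DualSelection::Second,
            DualSelection::Second => DualSelection::Both,
            DualSelection::Both => DualSelection::First,
        }
    }

    /// String stored under dual_selection; the empty string means off.
    fn as_config_str(self) -> &'static str {
        match self {
            DualSelection::Off => "",
            DualSelection::First => "first",
            DualSelection::Second => "second",
            DualSelection::Both => "both",
        }
    }
}
```

Starting from Off, four presses land on First, not Off, which matches the note above.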
Every ranch has a title document. Yours is .foundry.json at the project root. Here are the fields that matter most.
| Field | Type | Default | Description |
|---|---|---|---|
| Execution | | | |
| run_mode | String | "auto" | Execution mode: "auto", "sprint", or "review" |
| pipeline_mode | String | "full" | Pipeline mode: "full", "fast", or "backpressure" |
| batch_doubt | bool | true | Defer Doubt to last task only |
| backpressure_only | bool | true | Skip LLM review, rely on build/test/lint only |
| skip_planner_for_simple | bool | true | Skip planner for Simple-complexity tasks |
| Models & Providers | | | |
| scout_model | String | "sonnet" | Model name for Scout agent |
| scout_provider | String | "claude" | Provider for Scout: "claude" or "codex" |
| planner_model | String | "opus" | Model name for Planner agent |
| planner_provider | String | "claude" | Provider for Planner: "claude" or "codex" |
| builder_model | String | "opus" | Model name for Builder agent |
| builder_provider | String | "claude" | Provider for Builder: "claude" or "codex" |
| reviewer_model | String | "sonnet" | Model name for Reviewer agent |
| reviewer_provider | String | "claude" | Provider for Reviewer: "claude" or "codex" |
| fixer_model | String | "sonnet" | Model name for Fixer agent |
| fixer_provider | String | "claude" | Provider for Fixer: "claude" or "codex" |
| discovery_model | String | "opus" | Model name for Discovery agent |
| discovery_provider | String | "claude" | Provider for Discovery: "claude" or "codex" |
| pattern_extraction_model | String | "sonnet" | Model for pattern extraction (lightweight JSON output) |
| Simple Task Overrides | | | |
| simple_planner_model | String | "sonnet" | Planner model override for Simple-complexity tasks |
| simple_builder_model | String | "sonnet" | Builder model override for Simple-complexity tasks |
| simple_reviewer_model | String | "" | Reviewer model for Simple tasks (empty = use backpressure_only) |
| Dual-Model Arena | | | |
| builder_models | Vec<String> | [] | Dual-model specs, format "provider:model". Overrides builder_model when len >= 2. |
| dual_selection | String | "" (off) | Dual-build selection: "first", "second", "both", or "" (off) |
| Review | | | |
| review_mode | String | "diff-only" | Review input: "diff-only" passes git diff, "file-list" uses changed file list |
| review_multipass_threshold | usize | 8 | File count threshold for multi-pass review (0 = disable) |
| confidence_threshold | f64 | 0.5 | Reviewer finding confidence below this is logged, not auto-fixed |
| Timing & Pauses | | | |
| agent_timeout_secs | u64 | 600 | Max seconds before killing an agent (10 min) |
| adaptive_pauses | bool | true | Skip inter-agent sleep when last agent was not rate-limited |
| pause_between_tasks_secs | u64 | 10 | Seconds between tasks (skipped when adaptive_pauses and no rate limit) |
| pause_between_agents_secs | u64 | 3 | Seconds between pipeline stages |
| pause_between_cycles_secs | u64 | 30 | Seconds between discovery cycles |
| planner_lookahead | bool | true | Pre-plan task N+1 while builder runs task N |
| discovery_cooldown_minutes | u64 | 5 | Minutes to wait before first discovery (doubles on 0-find rounds, max 30) |
| Git & PRs | | | |
| auto_push_remote | Option<String> | null | Git remote to auto-push after commits (null = local only) |
| pr_poll_interval_secs | u64 | 30 | Seconds between PR review status checks |
| create_issue_on_wip | bool | false | Create GitHub issue on WIP (failed) commits |
| Patterns & Embeddings | | | |
| patterns_dir | String | "~/.foundry/patterns" | Directory for global pattern store |
| max_pattern_injection | usize | 10 | Max patterns injected into agent prompts |
| semantic_match_enabled | bool | true | Enable semantic pattern matching via Ollama embeddings |
| embedding_model | String | "nomic-embed-text" | Ollama model for semantic pattern embeddings |
| ollama_url | String | "http://127.0.0.1:11435" | Ollama API base URL for embedding requests |
| embedding_timeout_ms | u64 | 2000 | Timeout for Ollama embedding requests in milliseconds |
| Orchestrator | | | |
| orchestrator_proposer_provider | String | "claude" | Provider for orchestrator proposer |
| orchestrator_proposer_model | String | "opus" | Model for orchestrator proposer |
| orchestrator_reviewer_provider | String | "claude" | Provider for orchestrator reviewer |
| orchestrator_reviewer_model | String | "opus" | Model for orchestrator reviewer |
| orchestrator_max_iterations | usize | 3 | Max proposer/reviewer iterations |
| orchestrator_accept_policy | String | "no-high-medium" | Acceptance: "no-high", "no-high-medium", or "no-findings" |
| planning_iterations | u64 | 0 | Max iterations for foundry plan mode (0 = unlimited) |
| Extensions | | | |
| extensions | Vec<String> | [] | Selected extension names (e.g., ["roblox", "extend"]) |
| Display & Limits | | | |
| theme | String | "dark" | TUI color theme: "dark", "catppuccin", or "solarized" |
| truecolor | Option<bool> | null | Override truecolor detection (null = auto-detect) |
| preview_wrap | bool | true | Wrap long lines in build preview panel |
| cost_limit | f64 | 0.0 | Max session cost in USD (0.0 = no limit) |
| build_command | Option<String> | null | Custom build/compile command run after builder completes |
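Two of the rules in the table are easier to read as code: the discovery cooldown backoff and the orchestrator acceptance policy. The sketch below is a reading of the descriptions, not the tool's implementation. The function names and the `Severity` enum are hypothetical, resetting the cooldown to 5 minutes after a productive round is an assumption, and so is the idea that orchestrator findings carry the same HIGH / MEDIUM / LOW levels as patterns.

```rust
/// Next discovery cooldown in minutes.
/// Doubling on a 0-find round and the 30-minute cap come from the table;
/// the reset to the 5-minute default after a productive round is assumed.
fn next_discovery_cooldown(current_minutes: u64, tasks_found: usize) -> u64 {
    const DEFAULT_MINUTES: u64 = 5;
    const MAX_MINUTES: u64 = 30;
    if tasks_found == 0 {
        current_minutes.saturating_mul(2).min(MAX_MINUTES)
    } else {
        DEFAULT_MINUTES
    }
}

/// Hypothetical finding severity, borrowed from the pattern schema.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    High,
    Medium,
    Low,
}

/// Apply orchestrator_accept_policy to a list of finding severities.
/// Returns None for a policy string that is not one of the three listed.
fn accepts(policy: &str, findings: &[Severity]) -> Option<bool> {
    let has = |s: Severity| findings.contains(&s);
    match policy {
        "no-high" => Some(!has(Severity::High)),
        "no-high-medium" => Some(!has(Severity::High) && !has(Severity::Medium)),
        "no-findings" => Some(findings.is_empty()),
        _ => None,
    }
}
```

Under these assumptions, repeated empty discovery rounds step the cooldown through 5, 10, 20, and 30 minutes and then hold at 30, and the default "no-high-medium" policy accepts a proposal whose only findings are Low.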
Specialist deputies, each with their own jurisdiction. Extensions provide domain-specific knowledge to agents. They are discovered from ~/.foundry/extensions/ (global) and <project>/extensions/ (local). Each extension directory contains a CLAUDE.md file whose contents are prepended to agent prompts.
World generation using Lune scripting and .rbxl/.rbxm files. Key files: CLAUDE.md, patterns/roblox-common-issues.json. Gotcha: use add_to_world.luau, never generate_world.luau -- terrain has varying elevations that cause floating objects.
Workday Extend PaaS application development. Key files: CLAUDE.md, WORKDAY_EXTEND_DEVELOPER_GUIDE.md, WORKDAY_EXTEND_ARCHITECTURE.md. Gotcha: apps are metadata-driven with no arbitrary code execution -- WIDs are tenant-specific, always use Reference IDs instead.
Fleet reconnaissance from a management server -- inventory lookups, iDRAC queries, racadm commands against Dell servers. Key files: CLAUDE.md, config/inventory-schema.json. Gotcha: always use grep -w for hostname lookups and label batch output with the current hostname.
Building Flowise AgentFlow v2 workflows. Output is single JSON files. Key files: CLAUDE.md, FLOWISE.md, node-templates/*.json. Gotcha: use HTML span format for all variable references and follow the validation checklist before importing.
Python orchestrator for an AI Minecraft bot that plays autonomously. Monitors bot state, assigns goals, recalls the bot if it wanders too far from spawn. Architecture: client, detector, goals, learner, monitor, orchestrator, persistence, planner modules.
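Extension loading comes down to two steps: locate each extension's CLAUDE.md, then prepend its contents to the agent prompt. A minimal sketch, assuming hypothetical function names, a blank-line separator between documents, and no particular precedence between the global and local copies (the document does not say which wins).

```rust
use std::path::{Path, PathBuf};

/// Candidate CLAUDE.md locations for one extension:
/// the global ~/.foundry/extensions/ tree and the project's extensions/ tree.
fn extension_doc_candidates(home: &Path, project: &Path, name: &str) -> Vec<PathBuf> {
    vec![
        home.join(".foundry").join("extensions").join(name).join("CLAUDE.md"),
        project.join("extensions").join(name).join("CLAUDE.md"),
    ]
}

/// Prepend already-loaded extension documents to a prompt.
/// The "\n\n" separator is an assumption made for readability.
fn prepend_extension_context(extension_docs: &[&str], prompt: &str) -> String {
    let mut out = String::new();
    for doc in extension_docs {
        out.push_str(doc);
        out.push_str("\n\n");
    }
    out.push_str(prompt);
    out
}
```

Reading the files is left to the caller so the prompt assembly stays a pure function; with no extensions selected, the prompt passes through unchanged.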
Here is what matters for architects evaluating this system.
Configuration lives in the .foundry.json file. No infrastructure changes are required to change behavior.