Mapped against the Claude Certified Architect -- Foundations Exam Guide (Anthropic, 2025)

| Task | Principle | Status | Evidence |
|---|---|---|---|
| 1.1 | Agentic loop lifecycle (stop_reason, tool_use vs end_turn) | Implemented | agent.rs:38 -- parses stream-json events, loops on tool_use |
| 1.1 | Model-driven decision-making vs pre-configured decision trees | Implemented | Agents choose their own tools; Foundry provides role prompts, not tool sequences |
| 1.1 | Avoid anti-patterns: NL parsing for termination, arbitrary iteration caps | Implemented | Uses stream-json structured events. Timeout is a safety net, not a stop mechanism |
| 1.2 | Hub-and-spoke coordinator with isolated subagent context | Implemented | app/build.rs orchestrates agents as isolated CLI invocations with no shared history |
| 1.2 | Coordinator handles decomposition, delegation, result aggregation | Implemented | TASKS.md for decomposition, stage agents for delegation, .buildloop/ for aggregation |
| 1.3 | Subagent context must be explicitly provided, not inherited | Implemented | Each agent is a fresh CLI process. Artifacts passed as file references |
| 1.3 | AgentDefinition with descriptions, prompts, tool restrictions | Implemented | prompts.rs per-role prompts; agent.rs:562 allowed_tools per invocation |
| 1.3 | Fork-based session management | N/A | Foundry spawns fresh processes, not Claude Code sessions |
| 1.4 | Programmatic enforcement (hooks, prerequisite gates) | Implemented | gate_builder and gate_reviewer in build.rs:73,96. Extension gate at build.rs:1513 |
| 1.4 | Deterministic compliance for critical operations | Implemented | Gates are code, not prompts. Plan must have File Operations + Verification sections |
| 1.4 | Structured handoff protocols between stages | Implemented | Each stage writes structured .buildloop/ artifacts consumed by downstream stages |
| 1.5 | Agent SDK hooks (PostToolUse) for tool call interception | N/A | Uses Claude Code CLI, not Agent SDK |
| 1.5 | Hooks for deterministic guarantees vs prompt-based compliance | Partial | Gates enforce stage ordering; no tool-call-level interception within a session |
| 1.6 | Fixed sequential pipelines vs dynamic adaptive decomposition | Implemented | Fixed pipeline with adaptive elements (complexity-based skip, retry-with-feedback) |
| 1.6 | Prompt chaining for multi-step workflows | Implemented | Stages chain via file artifacts. Reviewer findings chain into fixer |
| 1.6 | Adaptive investigation plans based on discoveries | Partial | Discovery agent adapts task generation, but agents don't spawn sub-investigations |
| 1.7 | Named session resumption (--resume) | N/A | Agents are stateless one-shot invocations. State lives in .buildloop/ files |
| 1.7 | fork_session for parallel exploration | N/A | Foundry doesn't use Claude Code sessions |
| 1.7 | Crash recovery via structured state persistence | Implemented | .buildloop/ artifacts + TASKS.md SPID progress survive crashes |
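
The 1.4 rows describe gates as code rather than prompts: a plan only proceeds if it contains File Operations and Verification sections. A minimal sketch of such a check follows. The function name `check_plan_sections`, the `GateError` type, and the assumption that sections appear as Markdown headings are all hypothetical and are not taken from build.rs.

```rust
/// Hypothetical error returned when a plan fails the gate.
#[derive(Debug, PartialEq)]
pub struct GateError {
    /// Required section titles that were not found in the plan.
    pub missing_sections: Vec<String>,
}

/// Section titles the plan must contain (named in the table above).
const REQUIRED_SECTIONS: [&str; 2] = ["File Operations", "Verification"];

/// Deterministic gate: passes only if every required section is present.
/// Assumption: sections are Markdown headings such as "## File Operations".
pub fn check_plan_sections(plan: &str) -> Result<(), GateError> {
    // Collect heading titles with the leading '#' markers stripped.
    let headings: Vec<&str> = plan
        .lines()
        .map(str::trim)
        .filter(|line| line.starts_with('#'))
        .map(|line| line.trim_start_matches('#').trim())
        .collect();

    let mut missing_sections = Vec::new();
    for required in REQUIRED_SECTIONS {
        if !headings.iter().any(|h| h.eq_ignore_ascii_case(required)) {
            missing_sections.push(required.to_string());
        }
    }

    if missing_sections.is_empty() {
        Ok(())
    } else {
        Err(GateError { missing_sections })
    }
}
```

Under these assumptions, the `missing_sections` list is the kind of validation error that could be appended to a planner retry prompt, as the 4.4 retry-with-error-feedback row describes.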

| Task | Principle | Status | Evidence |
|---|---|---|---|
| 2.1 | Clear tool descriptions with input formats and boundaries | N/A | Agents use Claude Code's built-in tools, not custom MCP tools |
| 2.2 | Structured error responses (isError, errorCategory, isRetryable) | N/A | MCP tools exist for external use, not internal agent orchestration |
| 2.3 | Scoped tool access per agent role | Implemented | agent.rs:562 allowed_tools. Skills restrict to 3-4 tools each |
| 2.3 | Too many tools degrade selection reliability | Implemented | Reviewer is read-only. Skills restrict to role-appropriate tools |
| 2.4 | MCP server scoping (project vs user level) | N/A | Foundry doesn't configure MCP servers for its agents |
| 2.4 | MCP resources as content catalogs | Implemented | mcp.rs:125 pattern catalog + mcp.rs:131 extension index as browsable MCP resources via foundry:// URIs |
| 2.5 | Effective use of built-in tools (Read, Write, Edit, Bash, Grep, Glob) | Implemented | Agent prompts guide tool selection per role |
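
The 2.3 rows describe per-role tool scoping, with the reviewer kept read-only. The sketch below shows one way to express that mapping. The `Role` enum, the exact tool list per role, and the comma-separated argument format are illustrative assumptions; the actual logic at agent.rs:562 may differ, and the flag spelling should be checked against the Claude Code CLI documentation.

```rust
/// Hypothetical agent roles; the real set of roles may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Builder,
    Reviewer,
}

/// Tools each role may use. The reviewer list contains only read-only
/// tools, matching the "Reviewer is read-only" claim in the table.
/// Assumption: the specific tool lists are illustrative.
pub fn allowed_tools(role: Role) -> &'static [&'static str] {
    match role {
        Role::Builder => &["Read", "Write", "Edit", "Bash", "Grep", "Glob"],
        Role::Reviewer => &["Read", "Grep", "Glob"],
    }
}

/// Builds the CLI arguments that restrict tools for one invocation.
/// Assumption: the flag accepts a comma-separated list of tool names.
pub fn tool_args(role: Role) -> Vec<String> {
    vec![
        "--allowedTools".to_string(),
        allowed_tools(role).join(","),
    ]
}
```

Keeping the mapping in one function means a role's tool surface can be audited in a single place rather than inferred from prompts.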

| Task | Principle | Status | Evidence |
|---|---|---|---|
| 3.1 | CLAUDE.md hierarchy (user > project > directory) | Implemented | Agents inherit CLAUDE.md via normal loading. Foundry appends orchestration override |
| 3.1 | .claude/rules/ for path-scoped conventions | Implemented | 6 rule files with paths: frontmatter scoping |
| 3.1 | @import for modular CLAUDE.md | Not Used | Rules already split into .claude/rules/. Could be useful for extensions |
| 3.2 | Custom slash commands in .claude/commands/ | Not Used | Skills in .claude/skills/ serve this purpose instead |
| 3.2 | Skills with SKILL.md, context: fork, allowed-tools | Implemented | 3 skills (audit, scout, extract-patterns) with fork context and scoped tools |
| 3.3 | Path-specific rules with YAML frontmatter | Implemented | All 6 rule files use paths: frontmatter for conditional loading |
| 3.4 | Plan mode vs direct execution | Implemented | The planner stage serves as plan mode. The complexity classifier can skip it for simple tasks |
| 3.5 | Iterative refinement with concrete I/O examples | Implemented | Few-shot severity examples in reviewer. JSON template in pattern extractor |
| 3.5 | Test-driven iteration | Implemented | Builder runs tests. Reviewer re-runs independently. Fixer iterates on failures |
| 3.6 | CI/CD integration (--output-format json) | Implemented | --output-format json on foundry run --no-tui. SessionReport with tasks/session/config. Schema: docs/ci-output-schema.json |
| 3.6 | Session context isolation -- fresh reviewer | Implemented | Core design principle. Verify agent is a completely separate CLI invocation |
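
The 3.4 row says the planner stage can be skipped for simple tasks. A minimal sketch of that decision, assuming a two-level complexity classification and a three-stage pipeline; the `Complexity` and `Stage` enums and the `stages_for` function are hypothetical names, not Foundry's actual types.

```rust
/// Hypothetical output of the complexity classifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Complexity {
    Simple,
    Complex,
}

/// Hypothetical pipeline stages; the real pipeline has more (scout, fixer, ...).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Planner,
    Builder,
    Reviewer,
}

/// Returns the stages to run, in order. The planner is skipped for simple
/// tasks; the independent reviewer always runs.
pub fn stages_for(complexity: Complexity) -> Vec<Stage> {
    let mut stages = Vec::new();
    if complexity != Complexity::Simple {
        stages.push(Stage::Planner);
    }
    stages.push(Stage::Builder);
    stages.push(Stage::Reviewer);
    stages
}
```

In this sketch only the planning step is optional, so a simple task still gets a fresh-context review.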

| Task | Principle | Status | Evidence |
|---|---|---|---|
| 4.1 | Explicit criteria over vague instructions | Implemented | Reviewer defines severity criteria with examples. "What to report" and "what to skip" lists |
| 4.1 | Explicit criteria reduce false positives | Implemented | Categorical criteria, not confidence-based filtering |
| 4.2 | Few-shot examples for output consistency | Implemented | Reviewer severity examples. Pattern extractor JSON template. Build-claims format |
| 4.2 | Few-shot for ambiguous-case handling | Implemented | prompts.rs:572-602 three borderline severity examples: unchecked file read = HIGH, test-only return value = LOW, unwrap on constant = SKIP |
| 4.3 | Structured output via tool_use with JSON schemas | N/A | Uses Claude Code CLI, not the API |
| 4.4 | Retry-with-error-feedback | Implemented | Gate failure triggers planner retry with validation error appended. Agent timeouts are also retried |
| 4.4 | Feedback loops -- tracking which patterns trigger findings | Implemented | "Applied" counter tracks patterns in agent output. Patterns with frequency 3+ are auto-promoted |
| 4.5 | Batch processing (Message Batches API) | N/A | Sequential processing via CLI, not API batch endpoint |
| 4.6 | Multi-instance review -- independent reviewer | Implemented | Core architecture. Fresh CLI invocation with zero shared context from builder |
| 4.6 | Multi-pass review (per-file + cross-file integration) | Implemented | review.rs:269 run_multipass_review splits into per-file analysis + cross-file integration pass when files exceed review_multipass_threshold (default 8) |
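
The 4.6 multi-pass row describes switching from a single review to per-file passes plus a cross-file integration pass once the file count exceeds `review_multipass_threshold` (default 8). The sketch below models only that scheduling decision. The `ReviewPass` enum and `plan_review_passes` function are hypothetical and are not the signature of `run_multipass_review`.

```rust
/// Hypothetical description of one review pass.
#[derive(Debug, PartialEq, Eq)]
pub enum ReviewPass {
    /// One pass over all changed files together.
    Single(Vec<String>),
    /// Focused analysis of one file.
    PerFile(String),
    /// Integration pass looking across all files.
    CrossFile(Vec<String>),
}

/// Default taken from the table above (review_multipass_threshold).
pub const DEFAULT_MULTIPASS_THRESHOLD: usize = 8;

/// Plans review passes: a single pass at or below the threshold, otherwise
/// one pass per file followed by a cross-file integration pass.
pub fn plan_review_passes(files: &[String], threshold: usize) -> Vec<ReviewPass> {
    if files.len() <= threshold {
        return vec![ReviewPass::Single(files.to_vec())];
    }
    let mut passes: Vec<ReviewPass> = files
        .iter()
        .cloned()
        .map(ReviewPass::PerFile)
        .collect();
    passes.push(ReviewPass::CrossFile(files.to_vec()));
    passes
}
```

Here "exceed" is read strictly, so exactly 8 files still get a single pass; that boundary is an interpretation of the table text.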

| Task | Principle | Status | Evidence |
|---|---|---|---|
| 5.1 | Lost-in-the-middle effect mitigation | Implemented | Scout report: Key Facts first (beginning bias), Risks last (recency bias) |
| 5.1 | Trimming verbose tool output before context accumulation | Implemented | agent.rs:1505 truncate_for_preview trims tool output to 200 chars. Build/test output trimmed between builder and reviewer stages |
| 5.1 | Persistent structured state outside conversation history | Implemented | .buildloop/ files persist scout report, plan, claims, review across context boundaries |
| 5.2 | Escalation patterns (human-in-the-loop) | Implemented | Review mode pauses for approval. WIP commits + GitHub issues escalate failures |
| 5.2 | Escalation on inability to progress, not just complexity | Implemented | A verify failure after the fixer retry produces a WIP commit plus a GitHub issue. Discovery backs off when nothing is found |
| 5.3 | Structured error propagation across multi-agent systems | Implemented | context.rs:35 StageResult struct with failure_type, attempted_action, partial_results, suggestions. Fixer receives structured context |
| 5.3 | Distinguish access failures from valid empty results | Implemented | context.rs:13 FailureType enum: Timeout, Crash, GateFail, ReviewFail, RateLimited, StopRequested |
| 5.4 | Context degradation in extended sessions | Implemented | Each agent is a fresh session. Long sessions are architecturally impossible |
| 5.4 | Scratchpad files for persisting findings | Implemented | .buildloop/ artifacts are exactly this pattern |
| 5.4 | Crash recovery via structured state exports | Implemented | TASKS.md SPID progress + .buildloop/ artifacts survive crashes |
| 5.5 | Human review workflows and confidence calibration | Implemented | Review mode creates PRs with polling. review.rs:714 confidence scores (0.0-1.0) with config.rs:199 configurable threshold |
| 5.5 | Confidence scores for calibrated review routing | Implemented | review.rs:714-758 per-finding confidence (0.0-1.0). Below confidence_threshold (default 0.5) flagged for manual review, not auto-fixed |
| 5.6 | Information provenance in multi-source synthesis | Implemented | review.rs:675 source_evidence field on every finding: snippet, line_range, reasoning chain. Fixer receives full provenance |
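
The 5.5 rows describe routing findings by confidence: anything below `confidence_threshold` (default 0.5) is flagged for manual review instead of being auto-fixed. A minimal sketch of that rule follows. The `Route` enum and `route_finding` function are hypothetical names, and sending non-numeric scores to manual review is an added assumption.

```rust
/// Hypothetical routing outcome for one review finding.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    AutoFix,
    ManualReview,
}

/// Default taken from the table above (confidence_threshold).
pub const DEFAULT_CONFIDENCE_THRESHOLD: f64 = 0.5;

/// Routes a finding by its confidence score (expected range 0.0 to 1.0).
/// Scores below the threshold go to manual review. The comparison is
/// written as `>=` so that NaN also falls through to manual review
/// (assumption: an unscorable finding should not be auto-fixed).
pub fn route_finding(confidence: f64, threshold: f64) -> Route {
    if confidence >= threshold {
        Route::AutoFix
    } else {
        Route::ManualReview
    }
}
```

A score exactly at the threshold is auto-fixed in this sketch, since the table flags only scores below it.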