
Foundry Execution Backends

Architecture decision and Azure Container Apps design for Context Foundry build execution and scale-to-zero preview hosting.

Decision: Azure Container Apps | Status: Planning | 2026-05-24 | C4 + Azure topology

1 Context

Knowmler Lab turns a promoted idea into a working app by calling Context Foundry. Today the whole lifecycle (run the AI build, build the image, host the preview) runs on the main VPS through Foundry's local_docker backend. That has three problems:

Blast radius

AI-generated apps run as containers on the same host as Knowmler production and the trading bots. Untrusted code next to real money.

Contention & fragility

Builds spike CPU on the production box (the exit-125 incident), and the preview path is brittle (the foundry-caddy auto-HTTPS 502 and the edge-network gap).

No elasticity

Each preview is a long-lived container. Previews do not scale to zero and accumulate on the box.

The contract that stays fixed

Knowmler submits builds via POST /v1/jobs and polls. That interface does not change; only the execution target behind it does.
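
As a rough illustration of this contract, the sketch below submits one build and polls until a terminal status. Only `POST /v1/jobs`, the `org_slug` input, the `ready` status, and `preview_url` come from this document; the polling path, the `id` field, the `spec_md`/`tasks_md` body keys, and the `failed`/`canceled` statuses are assumptions made for the example. The transport is injected so the sketch does not depend on any particular HTTP library.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport signatures: each call returns the decoded JSON response.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]
Get = Callable[[str], Dict[str, Any]]

# Only "ready" appears in this document; the other two are assumed.
TERMINAL = {"ready", "failed", "canceled"}


def submit_and_poll(
    post: Post,
    get: Get,
    spec_md: str,
    tasks_md: str,
    org_slug: str,
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit one build, then poll until the job reaches a terminal status."""
    body = {"spec_md": spec_md, "tasks_md": tasks_md, "org_slug": org_slug}
    job = post("/v1/jobs", body)
    job_id = job["id"]  # assumed response field
    for _ in range(max_polls):
        status = get(f"/v1/jobs/{job_id}")  # assumed polling path
        if status["status"] in TERMINAL:
            return status
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Because the caller only sees this submit-and-poll shape, nothing in it changes when the execution target behind Foundry does.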

2 Decision

D1 — Backends live inside Foundry, never inside Knowmler

Knowmler's interface stays as it is: POST /v1/jobs + poll. The execution target (local Docker, Azure, a future GitHub or Cloudflare backend) is a BuildBackend selected by Foundry, configured by environment, swappable without touching Knowmler. This preserves the queue, idempotency, per-job logs, artifacts, cancel/TTL, cleanup, and the scoped-token auth model that Foundry's /v1 service already provides.
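
The selection rule in D1 can be sketched as a small registry keyed by an environment variable. Foundry itself is a Rust service, so this Python version is only an illustration of the idea: the `BuildBackend` method shown, the two placeholder classes, and the `local_docker` default are assumptions, while the variable name and backend names are taken from this document.

```python
from typing import Callable, Dict, Mapping, Protocol


class BuildBackend(Protocol):
    """Hypothetical shape of a backend; the real interface lives in Foundry."""

    name: str

    def run_build(self, job_id: str) -> str: ...


class LocalDocker:
    name = "local_docker"

    def run_build(self, job_id: str) -> str:
        # Placeholder: a real backend would run the build here.
        return f"{self.name}:{job_id}"


class AzureContainerApps:
    name = "azure_container_apps"

    def run_build(self, job_id: str) -> str:
        # Placeholder: a real backend would start an ACA Job here.
        return f"{self.name}:{job_id}"


BACKENDS: Dict[str, Callable[[], BuildBackend]] = {
    "local_docker": LocalDocker,
    "azure_container_apps": AzureContainerApps,
}


def select_backend(env: Mapping[str, str]) -> BuildBackend:
    """Pick the backend from the environment; the caller never names one."""
    # Falling back to local_docker when unset is an assumption.
    name = env.get("FOUNDRY_SERVICE_BUILD_BACKEND", "local_docker")
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown build backend: {name!r}") from None
```

Adding a future GitHub or Cloudflare backend would then mean one more registry entry inside Foundry, with no change on the Knowmler side.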

D2 — Adopt the in-tree azure_container_apps backend

Run builds as Azure Container Apps Jobs and previews as scale-to-zero Azure Container Apps, with images in Azure Container Registry and job objects in Azure Blob Storage. Chosen because it is the only complete, already-written off-VPS backend: it covers build and preview, scales to zero, isolates untrusted apps, and aligns with Foundry's existing abstraction. The cost is onboarding Azure as a vendor.

3 Alternatives considered

| Option | Verdict | Why |
| --- | --- | --- |
| Stay on main VPS (local_docker) | Rejected | Untrusted apps beside trading money; contention; preview fragility. |
| Dedicated preview VPS | Fallback | Cheap isolation, zero new code, but self-managed and not elastic. |
| GitHub build + Cloudflare Containers | Deferred | Cleanest single-vendor stack, but no backend exists: new BuildBackend + routing Worker. Most engineering, no scale problem yet. |
| Knowmler -> GitHub workflow_dispatch | Rejected | Bypasses the entire Foundry control plane. |

4 C4 architecture

Level 1 — System Context

C4Context
    title System Context - Knowmler Lab builds via Context Foundry on Azure
    Person(admin, "Lab Admin", "Promotes an idea and clicks Build")
    Person_Ext(viewer, "Preview Viewer", "Opens the built app")
    System(knowmler, "Knowmler", "Lab platform. Submits builds and polls status.")
    System(foundry, "Context Foundry Build Service", "/v1 control plane: queue, idempotency, logs, artifacts, TTL, auth proxy")
    System_Ext(azure, "Microsoft Azure", "Build compute and preview hosting")
    System_Ext(anthropic, "Anthropic Claude", "LLM, via Foundry scoped auth proxy")
    Rel(admin, knowmler, "Clicks Build")
    Rel(knowmler, foundry, "POST /v1/jobs; polls", "HTTPS + bearer")
    Rel(foundry, azure, "Runs build job, builds image, deploys preview", "ARM API")
    Rel(foundry, anthropic, "Per-job scoped proxy token", "HTTPS")
    Rel(viewer, azure, "Opens app.org.knowmler.com", "HTTPS")
    UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="2")
  

Level 2 — Container Diagram

C4Container
    title Container Diagram - azure_container_apps backend
    Person(admin, "Lab Admin")
    System_Boundary(kn, "Knowmler (VPS)") {
        Container(fe, "Frontend", "Next.js", "Lab UI and Build button")
        Container(be, "Backend + Worker", "FastAPI", "Calls /v1/jobs, polls")
    }
    System_Boundary(az, "Azure Resource Group: rg-foundry-prod") {
        Container(daemon, "foundry serve", "Container App (Rust)", "The /v1 control plane")
        Container(job, "Build Job", "ACA Job (foundry-builder)", "QRPBA, 2 vCPU / 4 GiB, scoped token")
        Container(preview, "Preview App", "Container App, scale-to-zero", "The built application")
        Container(acr, "Container Registry", "ACR", "foundry-builder + per-app images")
        ContainerDb(blob, "Job Storage", "Blob foundry-jobs", "inputs, logs, artifacts")
        ContainerDb(logs, "Log Analytics", "Workspace", "ACA environment logs")
    }
    System_Ext(anthropic, "Anthropic Claude")
    Rel(admin, fe, "Build")
    Rel(fe, be, "POST /api/lab/ideas/{id}/build")
    Rel(be, daemon, "POST /v1/jobs; poll", "HTTPS")
    Rel(daemon, job, "Create + start ACA Job", "ARM")
    Rel(job, blob, "I/O via SAS")
    Rel(job, anthropic, "scoped proxy token")
    Rel(daemon, acr, "Build app image (ACR Tasks)")
    Rel(daemon, preview, "Deploy from image", "ARM")
    Rel(preview, acr, "Pull image")
    Rel(daemon, logs, "logs")
    UpdateLayoutConfig($c4ShapeInRow="2", $c4BoundaryInRow="1")
  

5 Azure deployment design

Deployment topology

flowchart TB
    classDef az fill:#e6f2fb,stroke:#0078D4,stroke-width:1.5px,color:#243a5e;
    classDef edge fill:#f3f9fd,stroke:#5b6b7f,stroke-width:1px,color:#243a5e;
    VIEWER["Preview Viewer"]:::edge
    CF["Cloudflare DNS + TLS<br/>*.org.knowmler.com"]:::edge
    BE["Knowmler Backend + Worker<br/>POST /v1/jobs, poll"]:::edge
    subgraph RG["Azure Resource Group: rg-foundry-prod"]
        direction TB
        SVC["Container App: ca-foundry-service<br/>foundry serve (/v1)"]:::az
        ENV["Container Apps Env: cae-foundry"]:::az
        JOB["ACA Job: caj-foundry-build<br/>2 vCPU / 4 GiB"]:::az
        PREV["Container Apps: ca-preview-*<br/>scale-to-zero"]:::az
        ACR["Container Registry: crfoundry"]:::az
        ST["Storage + Blob<br/>stfoundryjobs / foundry-jobs"]:::az
        LOG["Log Analytics: log-foundry"]:::az
        ID["Managed Identity: id-foundry"]:::az
    end
    BE -->|/v1 HTTPS + bearer| SVC
    SVC -->|create/start ARM| JOB
    SVC -->|ACR Tasks build| ACR
    SVC -->|deploy ARM| PREV
    JOB -->|SAS I/O| ST
    JOB -->|pull builder| ACR
    PREV -->|pull app image| ACR
    SVC --- ENV
    JOB --- ENV
    PREV --- ENV
    ENV --> LOG
    SVC -. uses .-> ID
    VIEWER --> CF -->|CNAME| PREV

Resource inventory

| Resource | Name | Purpose |
| --- | --- | --- |
| Resource Group | rg-foundry-prod | One blast-radius boundary for all Foundry infra. |
| Container Apps Env | cae-foundry | Hosts daemon, build jobs, and previews. |
| Container App (daemon) | ca-foundry-service | foundry serve, the /v1 control plane. |
| Container Apps Job | caj-foundry-build | Per-build foundry-builder, 2 vCPU / 4 GiB. |
| Container Apps (preview) | ca-preview-<jobid> | One scale-to-zero app per preview. |
| Container Registry | crfoundry | Builder + per-app images. Basic SKU. |
| Storage + Blob | stfoundryjobs | Inputs, logs (append blob), artifacts. |
| Log Analytics | log-foundry | Backs the ACA environment. |
| Managed Identity | id-foundry | Daemon + jobs identity for ACR/storage/ARM. |

Identity and least-privilege roles

| Principal | Role | Scope | Why |
| --- | --- | --- | --- |
| ca-foundry-service MI | Contributor (or custom: Microsoft.App/* + ACR scheduleRun) | Resource group | Create/start jobs, build images, deploy/delete previews. |
| ca-foundry-service MI | Storage Blob Data Contributor | Storage account | Read/write job objects. |
| ca-foundry-service MI | AcrPush | ACR | Push built app images. |
| Job + preview MI | AcrPull | ACR | Pull builder + app images. |

6 Build & preview lifecycle

sequenceDiagram
    autonumber
    participant K as Knowmler worker
    participant F as foundry serve
    participant J as ACA Job (builder)
    participant R as ACR
    participant P as Preview App
    K->>F: POST /v1/jobs (SPEC.md, TASKS.md, org_slug)
    F->>F: enqueue, idempotency, SAS grant + scoped proxy token
    F->>J: create + start ACA Job
    J->>J: QRPBA build (LLM via proxy token)
    J-->>F: stream.jsonl via append blob
    F->>R: ACR Tasks build app image, push
    F->>P: deploy scale-to-zero Container App
    P-->>F: FQDN
    F-->>K: status ready + preview_url
    Note over F,P: TTL reaper + teardown DELETE Job, Container App, ACR repo
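
The final note in the sequence (TTL reaper and teardown) can be sketched as a pure function that picks expired previews and lists what to delete for each. This is a minimal sketch under stated assumptions: the `Preview` record, the per-preview TTL field, and the `job-execution:` and `acr-repo:` identifiers are hypothetical; only the `ca-preview-<jobid>` naming and the Job, Container App, ACR repo teardown order come from this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class Preview:
    """Hypothetical record the reaper would keep per deployed preview."""

    job_id: str
    created_at: datetime
    ttl: timedelta


def expired(previews: Iterable[Preview], now: datetime) -> List[Preview]:
    """Return the previews whose TTL has elapsed at `now`."""
    return [p for p in previews if now >= p.created_at + p.ttl]


def teardown_targets(p: Preview) -> List[str]:
    """List what to delete for one preview, in the order the diagram gives."""
    return [
        f"job-execution:{p.job_id}",             # hypothetical identifier
        f"container-app:ca-preview-{p.job_id}",  # naming from the resource inventory
        f"acr-repo:{p.job_id}",                  # hypothetical identifier
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stale = Preview("abc123", now - timedelta(hours=3), timedelta(hours=1))
    for preview in expired([stale], now):
        print(teardown_targets(preview))
```

Keeping the expiry decision separate from the deletion calls makes the reaper easy to test without touching Azure.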
  

7 Service configuration

These variables are read only when Foundry is built with --features azure and FOUNDRY_SERVICE_BUILD_BACKEND=azure_container_apps is set. The first seven are required.

| Variable | Required (default) | Example |
| --- | --- | --- |
| FOUNDRY_SERVICE_AZURE_SUBSCRIPTION_ID | yes | <sub-guid> |
| FOUNDRY_SERVICE_AZURE_RESOURCE_GROUP | yes | rg-foundry-prod |
| FOUNDRY_SERVICE_AZURE_LOCATION | yes | eastus2 |
| FOUNDRY_SERVICE_AZURE_STORAGE_ACCOUNT | yes | stfoundryjobs |
| FOUNDRY_SERVICE_AZURE_STORAGE_KEY | yes | <key, prefer Key Vault> |
| FOUNDRY_SERVICE_AZURE_ACR_NAME | yes | crfoundry |
| FOUNDRY_SERVICE_AZURE_ACA_ENVIRONMENT | yes | cae-foundry |
| FOUNDRY_SERVICE_AZURE_STORAGE_CONTAINER | no (foundry-jobs) | foundry-jobs |
| FOUNDRY_SERVICE_AZURE_MI_CLIENT_ID | no (system-assigned) | <mi-guid> |
| FOUNDRY_SERVICE_AZURE_SIGNED_URL_TTL_SECS | no (900) | 900 |
| FOUNDRY_SERVICE_AZURE_SAS_GRANT_TTL_SECS | no (3600) | 3600 |
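
A startup check for these settings might look like the sketch below. It is an illustration, not Foundry's actual loader: `load_azure_config` and the returned dictionary layout are hypothetical, while the variable names, the seven required entries, and the defaults are the ones listed above.

```python
from typing import Any, Dict, Mapping

PREFIX = "FOUNDRY_SERVICE_AZURE_"

REQUIRED = (
    "SUBSCRIPTION_ID",
    "RESOURCE_GROUP",
    "LOCATION",
    "STORAGE_ACCOUNT",
    "STORAGE_KEY",
    "ACR_NAME",
    "ACA_ENVIRONMENT",
)

# Defaults as listed in the table above.
DEFAULTS = {
    "STORAGE_CONTAINER": "foundry-jobs",
    "SIGNED_URL_TTL_SECS": "900",
    "SAS_GRANT_TTL_SECS": "3600",
}


def load_azure_config(env: Mapping[str, str]) -> Dict[str, Any]:
    """Validate required settings and apply defaults to the optional ones."""
    missing = [PREFIX + key for key in REQUIRED if not env.get(PREFIX + key)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))

    cfg: Dict[str, Any] = {key: env[PREFIX + key] for key in REQUIRED}
    for key, default in DEFAULTS.items():
        cfg[key] = env.get(PREFIX + key, default)

    # TTLs are seconds; fail early on non-numeric input.
    for key in ("SIGNED_URL_TTL_SECS", "SAS_GRANT_TTL_SECS"):
        cfg[key] = int(cfg[key])

    # None stands for the system-assigned identity in this sketch.
    cfg["MI_CLIENT_ID"] = env.get(PREFIX + "MI_CLIENT_ID") or None
    return cfg
```

In practice this would be called with `os.environ`; passing a mapping keeps the check testable.
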
Custom domains: previews resolve at <app>.<org-slug>.knowmler.com. Cloudflare holds DNS + edge TLS; a CNAME points the per-org wildcard at the ACA environment, with the preview bound as an ACA custom domain. The org-slug is the owning org's slug (fixed in commit 185cee5), capped at 63 characters, the maximum length of a DNS label.
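
The hostname rule can be sketched as follows. The `<app>.<org-slug>.knowmler.com` pattern and the 63-character cap come from the text above; the function name, lowercasing, truncating an over-long slug (rather than rejecting it), and the label character check are assumptions made for this example.

```python
import re

BASE_DOMAIN = "knowmler.com"
MAX_LABEL = 63  # cap stated above; also the DNS label length limit

# Letters, digits, and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")


def preview_hostname(app: str, org_slug: str) -> str:
    """Build <app>.<org-slug>.knowmler.com from an app name and an org slug."""
    app_label = app.lower()
    # Assumption: an over-long slug is truncated, then trailing hyphens dropped.
    slug_label = org_slug.lower()[:MAX_LABEL].rstrip("-")
    for label in (app_label, slug_label):
        if len(label) > MAX_LABEL or not _LABEL.match(label):
            raise ValueError(f"invalid DNS label: {label!r}")
    return f"{app_label}.{slug_label}.{BASE_DOMAIN}"
```

For example, `preview_hostname("todo", "acme")` yields `todo.acme.knowmler.com`.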

8 Cost model

List price, region-dependent.

9 Security posture

Scoped tokens, not the OAuth key

Builds get a per-job scoped proxy token, never the raw Claude OAuth credential. The auth proxy revokes it when the job ends or is canceled.

Short-TTL grants

Job I/O uses short-TTL SAS; artifact downloads use short-TTL signed URLs.

Isolation

Untrusted generated apps run in their own Container Apps, away from Knowmler production and the trading host.

Secret storage

Prefer Key Vault for the storage key and FOUNDRY_SERVICE_API_KEYS rather than plaintext env.

10 Implementation steps

  1. Provision rg-foundry-prod: Log Analytics, ACA environment, Storage + foundry-jobs container, ACR, managed identity, role assignments.
  2. Build Foundry with --features azure; build foundry-builder and push it to crfoundry (in a Rust container or CI; the host has no Rust toolchain).
  3. Deploy ca-foundry-service with the FOUNDRY_SERVICE_AZURE_* env and FOUNDRY_SERVICE_BUILD_BACKEND=azure_container_apps.
  4. Point Knowmler's FOUNDRY_API_URL at the Azure daemon; rotate FOUNDRY_SERVICE_API_KEYS per the runbook.
  5. Wire the per-org preview domain (Cloudflare CNAME + ACA custom domain). org_slug routing is already fixed.
  6. Canary: run one build end-to-end; confirm preview and teardown.
  7. Cutover: switch Knowmler builds to the Azure daemon; keep the VPS path as a documented rollback.

11 Risks & open questions