05 — 18-month build sequence

Project-Agent / control-plane-evaluation/05-build-sequence.md

Quarter-by-quarter sequence with team sizing constraints, customer-shippable increments per quarter, the first paying customer in production by end of Q3, compliance-evidence demoable to a CISO buyer by end of Q4, and multi-platform integration depth dominating Q5–Q6.

Headcount ramp (engineering only; PM/design/SRE/security additional):

| Q | Eng headcount | New hires this Q | Mix |
|---|---|---|---|
| Q1 | 6 | (founding) | 4 backend, 1 SRE, 1 founding designer-engineer |
| Q2 | 8 | +2 | +1 backend (identity), +1 integrations |
| Q3 | 10 | +2 | +1 backend (compliance/audit), +1 SRE |
| Q4 | 12 | +2 | +1 integrations, +1 frontend/product |
| Q5 | 13 | +1 | +1 integrations |
| Q6 | 14 | +1 | +1 integrations |

Specialist hires not in this count: Security Eng (Q3), AI/ML Eng for drift + behavioral models (Q4), Compliance program manager (Q3 — non-eng), Customer Eng / Solutions (Q3+).

---

Gantt — what gets built when

```
                        Q1          Q2          Q3          Q4          Q5          Q6
──────────────────────────────────────────────────────────────────────────
PDP (Cedar) bundle      ███████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Identity (SPIRE fork)   ░░░░░░░██████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Credential broker       ░░░░░░░░░░░░░░░░██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Owner-binding registry  ░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Audit chain (Merkle+TSA)░░░░░░░░░░░░░░░░░██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Customer S3 anchor      ░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
arx-verify CLI          ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Evidence emitter        ░░░░░░░░░░░░░░░░░░░░░░░░░░██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
SOC 2 + ISO 42001 maps  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
NIST + EU AI Act maps   ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░░░░░░░░░
OWASP NHI/Agentic maps  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░
Vendor questionnaire UI ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░
Discovery broker (frame)░░░░░░░░░░██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
eBPF shadow sensor      ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░
DRL (Redis + bloom)     ░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Kill-switch saga (Temp) ░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Atomic terminate UI     ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
OTLP receiver           ░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
ClickHouse spans store  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Cost attribution        ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░░░░░░░
Trace↔audit correlation ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░
PEP — SDK (LangChain)   ░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
PEP — SDK (CrewAI)      ░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
PEP — SDK (AutoGen)     ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░
PEP — SDK (OpenAI Ag)   ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░
PEP — MCP gateway       ░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
PEP — Envoy sidecar     ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░
PEP — egress proxy      ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░
ServiceNow native hook  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░░░
UiPath activity         ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░
Salesforce Apex package ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░
Bedrock action-group    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░
Foundry connector tier  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░
Gemini connector tier   ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░
watsonx skill wrapper   ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░
Engagement Canvas v1    ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Engagement Canvas v2    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Board View live wired   ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████░░░░░░░░░░░░░░░░░░░░░░░░░░░
Onboarding Concierge    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Single-tenant cell SaaS ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░░░░░░
Customer-VPC playbook   ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████
SOC 2 Type II audit     ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████
Customer #1 (paying)    ░░░░░░░░░░░░░░░░░░░░░░░░██████████████████████████████████████████████████
Customer #2 (paying)    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████████████████████████████
Customers #3–#5         ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████████░
──────────────────────────────────────────────────────────────────────────
                        Q1          Q2          Q3          Q4          Q5          Q6
```

---

Q1 — "Govern one open-framework agent end-to-end on shared SaaS"

Customer-shippable increment: A LangChain agent owned by a single customer's engineer is registered with ARX and gets a SPIFFE SVID at deploy time. Its tool calls are intercepted by the in-process PEP, every action is evaluated by the Cedar PDP against a customer-authored policy bundle, and every action emits an enforcement event into the audit log. The customer can hit a kill switch from the UI to terminate that agent within 30 seconds.

Build:

  • C3.1 PDP: Cedar evaluator embedded in a Python wheel (arx-pdp); bundle hot-reload from S3 (30s poll); decision logging to local file → flush to control plane
  • C3.3 policy authoring API + CLI (arx policy push)
  • C3.2 in-process PEP for LangChain (arx-langchain package): callback-handler wraps tool calls + LLM calls
  • C2.1 stub identity issuer: SPIRE fork, single tenant, JWT-SVID for the LangChain agent at startup via bootstrap exchange
  • C1.3 inventory store on Postgres + LISTEN/NOTIFY for change feed (no Kafka yet)
  • Engagement Canvas v1 (already shipped — keep and use)
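
The Q1 path (PEP intercepts a tool call, PDP decides, decision is logged) can be sketched roughly as follows. This is a minimal, framework-agnostic illustration under stated assumptions: it does not use the Cedar engine or LangChain's callback API, and `ToyPDP`, `Decision`, `govern`, and the allow-list policy shape are hypothetical names invented for the example, not the arx-pdp or arx-langchain interfaces.

```python
import json
import time
from dataclasses import asdict, dataclass
from functools import wraps
from typing import Any, Callable


@dataclass(frozen=True)
class Decision:
    agent_id: str
    action: str
    allowed: bool
    evaluated_at: float


class PolicyDenied(Exception):
    """Raised by the PEP when the PDP denies an action."""


class ToyPDP:
    """Stand-in for the policy decision point: a plain allow-list, not Cedar."""

    def __init__(self, allowed_actions: dict[str, set[str]]) -> None:
        self._allowed = allowed_actions
        self.decision_log: list[str] = []  # stand-in for the local decision log

    def evaluate(self, agent_id: str, action: str) -> Decision:
        allowed = action in self._allowed.get(agent_id, set())
        decision = Decision(agent_id, action, allowed, time.time())
        # Every decision is logged, whether it allows or denies.
        self.decision_log.append(json.dumps(asdict(decision)))
        return decision


def govern(pdp: ToyPDP, agent_id: str) -> Callable:
    """Decorator playing the PEP role: ask the PDP before running the tool."""

    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(tool)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            decision = pdp.evaluate(agent_id, tool.__name__)
            if not decision.allowed:
                raise PolicyDenied(f"{agent_id} may not call {tool.__name__}")
            return tool(*args, **kwargs)

        return wrapper

    return decorator


pdp = ToyPDP({"agent-1": {"search_docs"}})


@govern(pdp, "agent-1")
def search_docs(query: str) -> str:
    return f"results for {query}"


@govern(pdp, "agent-1")
def delete_records(table: str) -> str:
    return f"deleted {table}"
```

Calling `search_docs("cedar")` runs the tool, while `delete_records("users")` raises `PolicyDenied`; both calls leave an entry in `pdp.decision_log`.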

Team allocation (6 eng): 2 on PDP + bundle distribution, 1 on identity (SPIRE fork), 1 on LangChain SDK PEP, 1 on inventory + control-plane API, 1 SRE on shared SaaS infra (one region, us-east).

Exit criteria for Q1:

  • A demo customer agent runs end-to-end: SVID issued, tool call intercepted, Cedar policy evaluates, decision logged
  • Latency: PDP evaluation p99 < 10ms (relaxed for v1 — tighten in Q2)
  • 1 design partner in dev environment using the demo path
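
One way to check the latency criterion against recorded evaluation times is a nearest-rank percentile. The sketch below is an assumption about how such a check might look; `p99` and `meets_budget` are hypothetical helpers, and the 10ms default simply mirrors the Q1 target above.

```python
def p99(samples_ms: list[float]) -> float:
    """Nearest-rank 99th percentile of the given latency samples."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # ceil(0.99 * n) computed with integers to avoid float rounding surprises.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def meets_budget(samples_ms: list[float], budget_ms: float = 10.0) -> bool:
    """True when the p99 of the samples is strictly under the budget."""
    return p99(samples_ms) < budget_ms
```

With 1,000 samples, the check passes only if at most 10 of them are at or above the budget.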

Risks: identity bootstrap is harder than it looks. SPIRE customization is non-trivial.

---

Q2 — "Govern two more open-framework runtimes; ship audit chain v1 to customer S3"

Customer-shippable increment: Same as Q1 but works for CrewAI and any MCP-speaking agent. Audit events now batched into Merkle blocks, anchored with RFC 3161 timestamps, written to a customer-controlled S3 bucket. The customer can run arx-verify and validate chain integrity from their own infrastructure.

Build:

  • C3.2 in-process PEP for CrewAI
  • C3.2 MCP gateway PEP (forking the Anthropic MCP reference SDK)
  • C2.2 credential broker first version (AWS STS + generic OAuth client_credentials only)
  • C4.2 audit chain (Merkle + RFC 3161 + customer S3 anchor)
  • arx-verify CLI shipped via PyPI
  • C6.2 DRL on Redis Cluster + bloom filter snapshot in bundle
  • C1.1 discovery broker framework + clients for AWS Bedrock list-agents and Salesforce Tooling API (read-only enumeration; no enforcement yet)
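
The "bloom filter snapshot in bundle" idea can be illustrated with a small Bloom filter used as a local pre-check. The source does not expand "DRL"; this sketch assumes it is a list of denied or revoked agent identifiers, with Redis as the authoritative store. `BloomFilter`, `is_denied`, and the sizing parameters are hypothetical, and a plain Python set stands in for the Redis lookup.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def is_denied(agent_id: str, snapshot: BloomFilter, authoritative: set[str]) -> bool:
    """Local pre-check first; only a filter hit pays for the authoritative lookup."""
    if not snapshot.might_contain(agent_id):
        return False  # a Bloom filter miss is definitive
    return agent_id in authoritative  # stand-in for the Redis lookup
```

The design point is that the common case (agent not listed) is answered from the snapshot alone, while a hit is always confirmed against the authoritative list because the filter can report false positives. A snapshot is only as fresh as the last bundle reload, so entries added since then would be missed by this pre-check.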

Team allocation (8 eng): 2 on PEPs (CrewAI + MCP gateway), 1 on identity (continuing), 1 on credential broker, 1 on audit chain + arx-verify, 1 on DRL + kill-switch foundations, 1 on discovery, 1 SRE.

Exit criteria for Q2:

  • Customer agents on LangChain, CrewAI, and any MCP host all governed end-to-end
  • Audit chain anchored to customer S3 with passing arx-verify validation
  • 3 design partners in dev environment
  • Latency: PDP evaluation p99 < 5ms (tightened from the Q1 target of 10ms)
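
Chain-integrity validation of the kind the audit-chain criterion describes rests on recomputing a Merkle root and checking inclusion proofs. The sketch below shows that core idea only, under stated assumptions: it is not the arx-verify implementation, it omits RFC 3161 timestamp verification and the S3 anchor, and its handling of odd-sized levels (duplicating the last node) is a simplification. The `0x00`/`0x01` domain-separation prefixes follow the convention used in RFC 6962.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(event: bytes) -> bytes:
    return _h(b"\x00" + event)  # leaf prefix


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)  # interior-node prefix


def _next_level(level: list[bytes]) -> list[bytes]:
    return [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(events: list[bytes]) -> bytes:
    if not events:
        raise ValueError("a block needs at least one event")
    level = [leaf_hash(e) for e in events]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # simplification: duplicate the last node
        level = _next_level(level)
    return level[0]


def inclusion_proof(events: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Return (sibling_hash, sibling_is_left) pairs from leaf to root."""
    level = [leaf_hash(e) for e in events]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(event: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    acc = leaf_hash(event)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root
```

A verifier holding only the anchored root and a proof can confirm that a given event belongs to the block without needing the other events.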

Risks: customer-controlled S3 cross-account flow has many failure modes. RFC 3161 TSA selection has compliance implications (some customers want a CA-rooted TSA — research who they are early).

---

Q3 — "First paying customer in production — open-framework + ServiceNow"

Customer-shippable increment: Customer #1 (an F1000 with significant LangChain/CrewAI footprint + ServiceNow AI Agents adoption) signs the first paid contract. The platform governs ~500 of their agents. Atomic kill switch propagates across both surfaces in < 60s. Owner-binding is tied to the customer's Okta SCIM stream so departures auto-orphan agents.

Build:

  • C2.3 owner-binding registry + SCIM-driven orphan detection
  • C6.1 kill-switch orchestrator on Temporal (saga across in-process PEP + ServiceNow REST + DRL update)
  • C3.2 native ServiceNow Flow Designer custom action — first commercial-platform PEP shipped
  • C3.2 in-process PEP for OpenAI Agents SDK
  • Atomic-terminate UI (replaces flag-flip with the full ceremony)
  • C4.1 evidence emitter v1 with SOC 2 + ISO 42001 mapping (~30 controls each, source-line-attribution + release-SHA pinning)
  • Single-tenant cell deployment playbook (for customers that need it; not the first customer)
  • SOC 2 Type II audit kicked off (the AICPA process, ~6 months — must start now to land the report by Q5)
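
SCIM-driven orphan detection reduces to keeping a set of active users and flagging agents whose owner is no longer in it. The following is a minimal sketch, assuming a simplified update carrying a user ID and the SCIM `active` flag; `OwnerBinding`, `apply_scim_update`, and `find_orphans` are hypothetical names, not the registry's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OwnerBinding:
    agent_id: str
    owner_user_id: str


def apply_scim_update(active_users: set[str], user_id: str, active: bool) -> None:
    """Fold one simplified SCIM user update into the active-user set."""
    if active:
        active_users.add(user_id)
    else:
        active_users.discard(user_id)


def find_orphans(bindings: list[OwnerBinding], active_users: set[str]) -> list[str]:
    """Agents whose bound owner is no longer an active user."""
    return sorted(b.agent_id for b in bindings if b.owner_user_id not in active_users)
```

For example, deactivating a user who owns two agents makes both appear in `find_orphans` on the next pass; what happens to an orphaned agent (reassign, suspend, terminate) is a policy decision outside this sketch.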

Team allocation (10 eng): 2 on commercial-platform PEPs (ServiceNow + Salesforce groundwork), 1 on identity / owner-binding, 1 on kill-switch / Temporal, 1 on evidence emitter + framework mapping, 1 on PEP polish (OpenAI Agents SDK, MCP), 1 on discovery (Microsoft Graph + Bedrock add), 2 SRE (one on multi-region prep, one on observability), 1 frontend/product on terminate UI.

Exit criteria for Q3:

  • Customer #1 in production paying $250K+ ARR
  • ~500 agents governed end-to-end including ServiceNow AI Agents
  • Kill switch demonstrably propagates across SDK, MCP, and ServiceNow within 60s
  • Evidence emitter producing SOC 2 + ISO 42001-tagged events
  • 5 design partners total, 2 of them in paid production-trial pilots
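
The cross-surface kill-switch propagation can be pictured as an orchestrator that drives each surface to a confirmed stop within a deadline. This is a plain-Python sketch rather than Temporal code, and it only retries forward: classic sagas also define compensating actions, which are left out here on the assumption that a terminate should not be undone. `run_kill_switch`, `KillResult`, and the surface names are hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class KillResult:
    confirmed: list[str] = field(default_factory=list)
    failed: list[str] = field(default_factory=list)
    elapsed_s: float = 0.0

    @property
    def atomic(self) -> bool:
        """True only when every surface confirmed the terminate."""
        return not self.failed


def run_kill_switch(
    steps: dict[str, Callable[[str], bool]],
    agent_id: str,
    deadline_s: float = 60.0,
    max_attempts: int = 3,
    clock: Callable[[], float] = time.monotonic,
) -> KillResult:
    start = clock()
    result = KillResult()
    for surface, step in steps.items():
        ok = False
        for _ in range(max_attempts):
            if clock() - start > deadline_s:
                break  # out of time: leave this surface unconfirmed
            try:
                ok = step(agent_id)
            except Exception:
                ok = False  # treat errors as a failed attempt and retry
            if ok:
                break
        (result.confirmed if ok else result.failed).append(surface)
    result.elapsed_s = clock() - start
    return result
```

A result with a non-empty `failed` list is the case an operator needs surfaced in the UI, since the terminate was only partial. A durable workflow engine would additionally persist progress across process restarts, which this in-memory loop does not.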

Risks: ServiceNow ISV-app review timeline is the critical path. Start in Q1 to land in Q3. Salesforce parallel review starts in Q2.

---

Q4 — "CISO-grade compliance evidence — quarterly board package"

Customer-shippable increment: Customer #1's Q4 CISO board update includes an ARX-generated quarterly compliance package: auditor-verifiable, Merkle-anchored evidence covering SOC 2, ISO 42001, NIST AI RMF, and EU AI Act articles, with source-line attribution back to ARX release SHAs. The CISO reviews it at the board meeting. Customer #2 closes (an F500 in financial services, drawn by the evidence story).

Build:

  • C4.1 framework mappings expanded: NIST AI RMF 1.0 + AI 600-1 GenAI profile, EU AI Act articles 9–17 + 26
  • C4.3 evidence package builder + UI
  • C4.4 vendor questionnaire renderer (SIG / CAIQ / HECVAT) — partner integration with Vanta (read-from-Vanta workflow; we provide the data, they provide the workflow)
  • Board View: live data binding (was demo-only in Q1); quarterly board package generator
  • C5.1 OTLP receiver + ClickHouse spans store production-ready
  • C5.4 drift detector ported from existing arxsec-api/app/core/drift_detector.py — already real, just needs integration
  • Engagement Canvas v2 — wires to live deployment data, plan-vs-actual on the Board View
  • C3.2 Envoy sidecar PEP (for customers running Python agents in their own k8s)
  • C1.2 eBPF shadow-agent sensor v1 (Tetragon fork) — first customer can install in their k8s

Team allocation (12 eng): 2 on framework mappings + evidence builder + Board View, 1 on Vanta partnership / questionnaire renderer, 2 on observability (OTLP + ClickHouse), 1 on drift, 1 on Envoy sidecar, 1 on eBPF sensor (new hire), 1 on identity polish, 1 on PEP for OpenAI Agents SDK + AutoGen, 2 SRE.

Exit criteria for Q4:

  • Customer #1 board demo of evidence package — CISO signs off as "audit-ready"
  • Customer #2 closed at $400K+ ARR (financial services)
  • Trace ingest at 10K spans/sec sustained
  • Drift detector running across all customer agents
  • Vanta partnership signed, joint customer in pilot
  • 8 design partners total

Risks: the evidence-package quality is the gate. If a CISO at customer #1 looks at the package and says "this isn't what my auditor wants" — that's the existential bug for the company. Mitigation: 4 customer audit-team interviews in Q3 to validate the package format BEFORE building it.

---

Q5 — "Multi-platform integration depth — Salesforce + Bedrock + UiPath"

Customer-shippable increment: Salesforce Agentforce, AWS Bedrock AgentCore, and UiPath now governable end-to-end. Customers #3, #4, #5 onboard; one of them is a tech-vertical customer using six of the seven commercial platforms simultaneously and proves the cross-platform value.

Build:

  • C3.2 native Salesforce Apex managed package (AppExchange-listed by end of Q5; ISV review started Q2)
  • C3.2 native UiPath activity package (UiPath Marketplace-listed by end of Q5)
  • C3.2 native AWS Bedrock action-group Lambda layer + integration
  • C3.2 in-process PEP for AutoGen
  • C5.2 cost attribution engine (per-action rollup; $ per agent per day; integrated with Board View)
  • Single-tenant SaaS cell deployment shipping (customer #2 or one of #3-#5 chooses cell)
  • Onboarding Concierge agent (mirrors P0-6b in the original plan; consultant-channel-friendly)
  • Engagement Canvas v2 → Provision bridge (engagement plan converts to actual cohort deployments)
  • SOC 2 Type II audit completes; report available
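
The per-action rollup behind "$ per agent per day" amounts to grouping action costs by agent and calendar day. A minimal sketch, assuming each action already carries a cost in USD; `ActionCost` and `rollup` are hypothetical names, and `Decimal` is used so that summing many small amounts does not accumulate float error.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class ActionCost:
    agent_id: str
    day: date
    cost_usd: Decimal


def rollup(actions: list[ActionCost]) -> dict[tuple[str, date], Decimal]:
    """Sum per-action costs into a (agent_id, day) -> total USD mapping."""
    totals: dict[tuple[str, date], Decimal] = defaultdict(Decimal)
    for action in actions:
        totals[(action.agent_id, action.day)] += action.cost_usd
    return dict(totals)
```

How a single action's cost is derived (token pricing, tool-call fees, infrastructure share) is not specified in the plan and is outside this sketch.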

Team allocation (13 eng): 3 on commercial-platform PEPs (Salesforce, Bedrock, UiPath in parallel), 1 on AutoGen SDK PEP, 1 on cost attribution + workforce dashboards, 1 on Onboarding Concierge agent, 1 on Engagement → Provision bridge, 2 SRE (one on multi-region SaaS, one on cell deployments), 1 on identity polish + scale-test, 2 SRE / platform.

Exit criteria for Q5:

  • 3 commercial-platform PEPs shipped this quarter (Salesforce, Bedrock, UiPath), in addition to ServiceNow from Q3; watsonx, Foundry, and Gemini in flight for Q6
  • Customers #3–#5 in production
  • $2M+ ARR
  • SOC 2 Type II report obtained
  • 10K agents under governance across the customer base

Risks: Salesforce ISV review timeline can slip — if the Apex package isn't AppExchange-listed by end of Q5, the customer-facing claim of "Salesforce Agentforce governance" is degraded to "via egress proxy only." Recovery: ship the egress-proxy fallback as the always-on path even when the native PEP is live; the native one is just the better experience.

---

Q6 — "Foundry + Gemini + watsonx + customer-VPC v1 — close the multi-platform story"

Customer-shippable increment: Microsoft Foundry / Copilot Studio, Google Gemini Enterprise Agent Platform, and IBM watsonx Orchestrate now governable. Customer-VPC deployment available for the highly-regulated. The 11-target story is fully shippable.

Build:

  • C3.2 Microsoft Foundry connector tier (Power Platform custom connector + Graph webhooks + egress proxy); Azure Marketplace listing
  • C3.2 Google Gemini connector tier (Agent Builder function-call wrapping + egress proxy); Cloud Marketplace listing
  • C3.2 IBM watsonx skill-wrapper integration; IBM partner certification
  • C3.2 egress proxy hardening (the catch-all for platforms where in-process is impossible)
  • C5.3 unified trace ↔ audit ↔ enforcement correlation API
  • Customer-VPC deployment playbook + first deployment
  • Vendor questionnaire UI v1
  • C4.1 OWASP NHI Top 10 + OWASP Agentic AI mapping completed
  • C1.2 eBPF shadow-agent sensor production-ready

Team allocation (14 eng): 4 on commercial-platform PEPs (Foundry + Gemini + watsonx + egress proxy hardening), 1 on observability correlation API, 1 on customer-VPC deployment, 1 on framework-mapping completion, 1 on questionnaire UI, 1 on Onboarding Concierge polish, 2 SRE, 3 on quality (security-eng + drift + identity scale-test).

Exit criteria for Q6:

  • All 11 targets governable to v1 depth
  • Customer-VPC deployment with 1 customer
  • $4–6M ARR
  • 25K agents under governance
  • Public SOC 2 Type II report + ISO 42001 attestation in progress
  • Two of the seven commercial platforms have certified partner status (likely ServiceNow + AWS first; Salesforce + UiPath close behind)

Risks: Foundry, Gemini, and watsonx are the hardest commercial integrations and are stacked in the same quarter. Slip risk is real. Mitigation: prioritize Foundry (largest TAM), then Gemini, then watsonx. If watsonx slips to Q7, that's acceptable given its smaller install base.

---

Quarter most likely to slip: Q5

Why: Q5 stacks three new commercial-platform integrations (Salesforce + Bedrock + UiPath) in the same quarter, with their respective ISV-review timelines as the critical path. Salesforce review historically takes 8-14 weeks the first time. Even with a Q2 start, the AppExchange listing landing in Q5 is tight.

The leading indicator: if by end of Q3 the Salesforce ISV review hasn't moved past the security-review stage, Q5 ships only Bedrock + UiPath + ServiceNow polish, and Salesforce slides to Q6.

Recovery plan if Q5 slips:

  1. Ship the egress-proxy Salesforce coverage in Q5 instead of the native Apex package. Limited fidelity but real coverage.
  2. Pull Foundry's egress-proxy work forward from Q6 to Q5 to keep the multi-platform claim honest.
  3. Communicate to customers: "Salesforce coverage is via egress proxy in Q5, native Apex in Q6." Don't pretend the native is shipping when it isn't.

---

Honest pushback (per the workflow)

The plan above is operationally tight. Two things will not actually go this smoothly:

  1. Customer #1 will demand customizations that pull engineering off the platform roadmap. The first paying customer always does this. Plan: explicitly carve out 20% of Q3 + Q4 capacity for first-customer-specific work that is NOT in the roadmap. If the carve-out doesn't get used, great. If it gets used, the roadmap doesn't slip.
  2. The platform PEP integrations (Salesforce, Foundry, Bedrock, UiPath, watsonx, ServiceNow, Gemini) are seven separate engineering efforts, with seven separate partner relationships and seven separate certification timelines. This sequence assumes one integration engineer per platform. In practice, an engineer effective on Salesforce Apex is not the same engineer effective on AWS Lambda or watsonx skills. The realistic team shape is 1.5 engineers per platform integration in steady state. Either ramp engineering faster or ship fewer platforms in Q5/Q6.

The compromise: ship 5 of 7 commercial platforms by Q6, defer the bottom 2 (likely watsonx + Gemini, in some order, depending on customer pull) to Q7. Don't pretend all 7 land by month 18 if the staffing doesn't support it.
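
The capacity figures in this section can be worked through directly. The snippet below is only arithmetic on numbers already stated in this document (the Q3 and Q4 headcounts from the ramp table, the 20% carve-out, and 1.5 engineers per platform); the function names are illustrative.

```python
ENG_HEADCOUNT = {"Q3": 10, "Q4": 12}  # from the headcount ramp table
CARVE_OUT_PERCENT = 20                # first-customer carve-out
ENGINEERS_PER_PLATFORM = 1.5          # steady-state estimate from the text


def carve_out(headcount: int) -> float:
    """Engineers' worth of capacity reserved for first-customer work."""
    return headcount * CARVE_OUT_PERCENT / 100


def integration_engineers(platforms: int) -> float:
    """Steady-state integration staffing for a given number of platforms."""
    return platforms * ENGINEERS_PER_PLATFORM


reserved = {quarter: carve_out(n) for quarter, n in ENG_HEADCOUNT.items()}
all_seven = integration_engineers(7)
five_of_seven = integration_engineers(5)
```

This gives 2.0 and 2.4 engineers' worth of capacity reserved in Q3 and Q4, and 10.5 integration engineers for all seven platforms versus 7.5 for five of them.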

The other slip risk worth naming: the SOC 2 Type II audit timeline. Six months minimum from kicking off the audit period to issuing the report. If the audit doesn't kick off in Q3, the report doesn't land in Q5, and selling into financial-services / healthcare in Q5 onwards is materially harder. Kick the audit off no later than month 7.