Runbook: Bad Deployment / Rollback
- Severity: `SEV-1` when a new deploy causes customer-visible outage; otherwise `SEV-2`.
- Page: release owner and on-call engineer.
- Triage:
  - Check `/health/components` and API/frontend health gates.
  - Inspect deployment-run blockers and latest rollout notes.
  - Compare pre/post-deploy error-rate and latency deltas.
- Mitigation:
  - Pause rollout activity immediately.
  - Roll back the most recent deploy or deployment wave.
  - Keep live cohort rollout disabled until SLOs stabilize.
- Customer comms:
  - Update the status page immediately for any customer-visible regression.
  - Follow up with a direct customer notice if approval, audit, or SSO paths were impacted.
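
The triage comparison, severity rule, and comms rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual tooling: the `Window` shape, the function names, the path labels, and both thresholds are hypothetical, and real limits should come from the service's SLOs.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Window:
    """Aggregated metrics for one observation window (hypothetical shape)."""
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float  # 95th-percentile latency in milliseconds


# Illustrative thresholds only; substitute values derived from real SLOs.
MAX_ERROR_RATE_DELTA = 0.01  # allowed absolute increase in error rate
MAX_LATENCY_RATIO = 1.5      # allowed post/pre ratio of p95 latency

# Paths whose impact triggers a direct customer notice (labels are assumed).
DIRECT_NOTICE_PATHS = {"approval", "audit", "sso"}


def deltas(pre: Window, post: Window) -> Tuple[float, float]:
    """Return (error-rate delta, latency ratio) between pre- and post-deploy windows."""
    error_delta = post.error_rate - pre.error_rate
    if pre.p95_latency_ms > 0:
        latency_ratio = post.p95_latency_ms / pre.p95_latency_ms
    else:
        # No usable baseline: treat the ratio as unbounded so a human looks at it.
        latency_ratio = float("inf")
    return error_delta, latency_ratio


def should_roll_back(pre: Window, post: Window) -> bool:
    """True when either delta exceeds its illustrative threshold."""
    error_delta, latency_ratio = deltas(pre, post)
    return error_delta > MAX_ERROR_RATE_DELTA or latency_ratio > MAX_LATENCY_RATIO


def severity(customer_visible_outage: bool) -> str:
    """SEV-1 when the deploy causes a customer-visible outage, otherwise SEV-2."""
    return "SEV-1" if customer_visible_outage else "SEV-2"


def needs_direct_notice(impacted_paths: Set[str]) -> bool:
    """True when approval, audit, or SSO paths were impacted."""
    return bool(impacted_paths & DIRECT_NOTICE_PATHS)


if __name__ == "__main__":
    pre = Window(error_rate=0.002, p95_latency_ms=180.0)
    post = Window(error_rate=0.031, p95_latency_ms=210.0)
    print(should_roll_back(pre, post))            # True: error rate rose by 0.029
    print(severity(customer_visible_outage=True))  # SEV-1
    print(needs_direct_notice({"sso", "search"}))  # True: SSO was impacted
```

Keeping the checks as pure functions over pre-aggregated windows means the same logic can be fed from whichever metrics source the team uses; the decision to roll back still rests with the release owner and on-call engineer.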