Off-Hours Alert Routing

Project-Agent / library/workflows/off-hours-routing/README.md

Routes PagerDuty alerts that arrive outside business hours. The workflow checks Splunk for severity context, then either routes the alert to the on-call team immediately or defers it, if low-priority, to the morning queue.

Maturity: L3+ (Enforced and up)  ·  See the 5-level maturity model for where this workflow fits in your program.

Time Saved

Before: ~10 minutes per overnight alert manually triaging and deciding whether to wake on-call or defer to morning.

After: Automated severity-based routing. On-call is only paged for confirmed high-severity events; low-priority alerts queue for morning review.

Connectors

| Connector | Operations | Risk |
|-----------|-----------|------|
| PagerDuty | incidents:read, incidents:update | HIGH |
| Splunk | search:execute | LOW |
| Slack | chat:write | LOW |

Overall Risk: HIGH -- PagerDuty incidents:update modifies incident state (snooze, re-route). Requires HITL approval.
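The roll-up above can be read as "the workflow is as risky as its riskiest operation". The sketch below illustrates that reading only; it is an assumption about how the overall rating is derived, not a description of Arx internals. `RISK_ORDER`, `OPERATION_RISK`, `connector_risk`, and `overall_risk` are hypothetical names, and the `MEDIUM` level is assumed for completeness.

```python
# Hypothetical ordering of risk levels; MEDIUM is an assumed intermediate level.
RISK_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}

# Per-operation ratings as listed in this README.
OPERATION_RISK = {
    ("PagerDuty", "incidents:read"): "LOW",
    ("PagerDuty", "incidents:update"): "HIGH",
    ("Splunk", "search:execute"): "LOW",
    ("Slack", "chat:write"): "LOW",
}


def connector_risk(connector, op_risk=OPERATION_RISK):
    """Return the highest risk among one connector's operations."""
    levels = [risk for (name, _op), risk in op_risk.items() if name == connector]
    return max(levels, key=RISK_ORDER.__getitem__)


def overall_risk(op_risk=OPERATION_RISK):
    """Return the highest risk across every operation in the workflow."""
    return max(op_risk.values(), key=RISK_ORDER.__getitem__)
```

Under this reading, PagerDuty is rated HIGH in the connector table because `incidents:update` dominates the LOW-rated `incidents:read`.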

How It Works

  1. During off-hours, fetch new PagerDuty triggered incidents.
  2. Run a Splunk severity check for each alert to determine true urgency.
  3. High-severity alerts are routed to the on-call responder as soon as the HITL gate approves the re-route.
  4. Low-severity alerts are deferred by snoozing them until the morning window, also subject to HITL approval.
  5. Post a Slack summary of routing decisions.
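The steps above can be sketched as a small decision function. This is a minimal illustration rather than the contents of `workflow.py`: the `Decision` type, the `decide`, `triage`, and `summarize` helpers, the 0-to-1 severity scale, and the `0.7` threshold are all assumptions, and the Splunk lookup is passed in as a plain callable so no connector API is implied.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Decision:
    incident_id: str
    severity: float
    action: str  # "route" (page on-call) or "defer" (snooze until morning)


def decide(incident_id: str, severity: float, threshold: float = 0.7) -> Decision:
    """Route when the severity score meets the (assumed) threshold, else defer."""
    action = "route" if severity >= threshold else "defer"
    return Decision(incident_id, severity, action)


def triage(
    incident_ids: Iterable[str],
    severity_lookup: Callable[[str], float],
    threshold: float = 0.7,
) -> List[Decision]:
    """Apply the severity check to every newly triggered incident."""
    return [decide(i, severity_lookup(i), threshold) for i in incident_ids]


def summarize(decisions: List[Decision]) -> str:
    """Build the one-line text that would be posted to Slack."""
    routed = [d.incident_id for d in decisions if d.action == "route"]
    deferred = [d.incident_id for d in decisions if d.action == "defer"]
    return (
        f"Off-hours routing: {len(routed)} routed to on-call "
        f"({', '.join(routed) or 'none'}), "
        f"{len(deferred)} deferred to morning "
        f"({', '.join(deferred) or 'none'})"
    )
```

In this sketch the decisions are only proposals. Executing a route or a snooze would still go through the HITL gate described under ARX Governance.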

ARX Governance

  • Risk Classification:
      • PagerDuty:incidents:read -- LOW -- read-only incident retrieval
      • PagerDuty:incidents:update -- HIGH -- modifies incident state (snooze, re-route, escalation)
      • Splunk:search:execute -- LOW -- read-only severity context lookup
      • Slack:chat:write -- LOW -- informational routing summaries
  • HITL Gate: Enabled -- all PagerDuty incidents:update operations require human approval. Snoozing and re-routing decisions are presented to the on-call lead for confirmation before execution.
  • Approval Channel: #ops-approvals
  • Policy Rules:
      • PERMITTED: Reading PagerDuty incidents and running Splunk severity checks
      • PERMITTED: Posting Slack routing summaries
      • ESCALATED (HITL required): Snoozing low-severity PagerDuty incidents until morning
      • ESCALATED (HITL required): Re-routing high-severity PagerDuty incidents to on-call
      • DENIED: Resolving or acknowledging PagerDuty incidents without human review
  • Audit Trail: Every incident evaluated, Splunk severity score, routing decision (route vs. defer), HITL approval status, and final PagerDuty action are logged with timestamps.
  • Config: See arx.yaml for connector permissions, HITL gate configuration, off-hours window, and approval channel.
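The policy rules above amount to a three-way verdict per action. The sketch below shows one way such a gate could behave. It does not describe how Arx enforces policy: the action names, the `Verdict` enum, `evaluate`, `execute`, the `approved_by` parameter, and the default-deny fallback for unlisted actions are all hypothetical.

```python
from enum import Enum
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class Verdict(Enum):
    PERMITTED = "permitted"
    ESCALATED = "escalated"  # needs human approval before execution
    DENIED = "denied"


# Hypothetical action names mapped to the verdicts listed in this README.
RULES = {
    "pagerduty.read_incidents": Verdict.PERMITTED,
    "splunk.severity_check": Verdict.PERMITTED,
    "slack.post_summary": Verdict.PERMITTED,
    "pagerduty.snooze": Verdict.ESCALATED,
    "pagerduty.reroute": Verdict.ESCALATED,
    "pagerduty.resolve": Verdict.DENIED,
    "pagerduty.acknowledge": Verdict.DENIED,
}


def evaluate(action: str) -> Verdict:
    """Look up the verdict; unlisted actions are denied (an assumption)."""
    return RULES.get(action, Verdict.DENIED)


def execute(action: str, run: Callable[[], T], approved_by: Optional[str] = None) -> T:
    """Run the action only if policy allows it and any required approval exists."""
    verdict = evaluate(action)
    if verdict is Verdict.DENIED:
        raise PermissionError(f"{action} is denied by policy")
    if verdict is Verdict.ESCALATED and approved_by is None:
        raise PermissionError(f"{action} is awaiting HITL approval")
    return run()
```

For example, `execute("pagerduty.snooze", do_snooze)` raises until an approver is supplied, while `execute("slack.post_summary", post)` runs immediately.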

Setup

Prerequisites

```bash
pip install arx
```

Environment Variables

```bash
export PAGERDUTY_API_KEY="your-pagerduty-api-key"
export SPLUNK_URL="https://splunk.your-org.com:8089"
export SPLUNK_TOKEN="your-splunk-bearer-token"
export SLACK_BOT_TOKEN="xoxb-your-slack-token"
export SLACK_OPS_CHANNEL="#ops-overnight"
```
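A missing variable is easier to diagnose before the first scheduled run than during it. The helper below is an optional pre-flight check using only the standard library. `REQUIRED_VARS` mirrors the variables listed above, and `missing_env` is a hypothetical name, not part of the `arx` package.

```python
import os
from typing import List, Mapping, Optional

# The five variables this README asks you to export.
REQUIRED_VARS = (
    "PAGERDUTY_API_KEY",
    "SPLUNK_URL",
    "SPLUNK_TOKEN",
    "SLACK_BOT_TOKEN",
    "SLACK_OPS_CHANNEL",
)


def missing_env(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env()` with no arguments inspects the current process environment; an empty list means every variable is set.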

Run

```bash
# One-time execution
arx run workflow.py

# Register on schedule (every 10 minutes during off-hours, 20:00-08:00 UTC)
arx register --config arx.yaml
```

Customization

  • Adjust off_hours_start, off_hours_end, and morning_snooze_until times
  • Configure severity thresholds for route-vs-defer decisions
  • Change the HITL approval channel in arx.yaml
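When adjusting the window, note that an off-hours range such as 20:00-08:00 crosses midnight, so a plain `start <= now < end` comparison would never match. The sketch below shows one way to handle the wrap-around and to compute the snooze target. The function names are hypothetical, the defaults mirror the 20:00-08:00 UTC schedule mentioned under Run, and how `off_hours_start`, `off_hours_end`, and `morning_snooze_until` are actually read from `arx.yaml` is not shown here.

```python
from datetime import datetime, time, timedelta, timezone


def is_off_hours(now: datetime, start: time = time(20, 0), end: time = time(8, 0)) -> bool:
    """True when `now` falls in [start, end), including windows that cross midnight."""
    t = now.time()
    if start <= end:
        return start <= t < end
    # Window wraps past midnight, e.g. 20:00 -> 08:00.
    return t >= start or t < end


def snooze_until(now: datetime, morning: time = time(8, 0)) -> datetime:
    """Next occurrence of `morning` strictly after `now`, in the same timezone."""
    candidate = datetime.combine(now.date(), morning, tzinfo=now.tzinfo)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def snooze_seconds(now: datetime, morning: time = time(8, 0)) -> int:
    """Whole seconds from `now` until the morning window opens."""
    return int((snooze_until(now, morning) - now).total_seconds())
```

For example, an alert at 23:30 UTC falls inside the window and would be snoozed for 8.5 hours, until 08:00 UTC the next day.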