Context
When a policy engine returns require_approval, the request is parked
until a human reviewer accepts or rejects it. We need durability
(restarts must not lose pending approvals), a clear TTL (stale
approvals must auto-reject), and per-tenant scoping. The classic
options:
- Temporal workflow: a durable workflow per pending approval; timeouts, signals, and retries are first-class.
- AWS Step Functions / Argo Workflows: similar shape, more vendor coupling.
- Redis with TTL + a polling worker: store pending approvals as keys with a 24h TTL; reviewers `GET` the key and `SET` it to accept or reject.
Decision
Adopt Redis with 24h TTL for v1.
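```python
# api/approvals.py
import json

# tenant_id, request_id, action, context, redis, and now() are assumed to be
# provided by the surrounding request handler.
KEY = f"approval:{tenant_id}:{request_id}"
redis.setex(KEY, 86400, json.dumps({  # 86400 s = 24h TTL
    "status": "pending",
    "agent_action": action,
    "context": context,
    "submitted_at": now(),
}))
# reviewer endpoint mutates status; agent_executor checks status
```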
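The reviewer and executor paths named in that last comment might look like the sketch below. This is not the starter kit's code: `FakeRedis` is a hypothetical in-memory stand-in that mirrors only the redis-py calls used here (`setex`, `get`, and `set` with `xx=True, keepttl=True`) so the example runs without a server, and `approval_key`, `decide`, and `check_status` are illustrative names. It shows two points: rewriting the value should preserve the original TTL (`KEEPTTL`, available from Redis 6.0), and a missing key is read as an automatic rejection.

```python
import json
import time


class FakeRedis:
    """Hypothetical in-memory stand-in for the few redis-py calls used here."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        if entry[1] is not None and self._clock() >= entry[1]:
            del self._data[key]  # expire lazily on access
            return None
        return entry

    def setex(self, name, ttl_seconds, value):
        self._data[name] = (value, self._clock() + ttl_seconds)
        return True

    def get(self, name):
        entry = self._live(name)
        return None if entry is None else entry[0]

    def set(self, name, value, xx=False, keepttl=False):
        entry = self._live(name)
        if xx and entry is None:
            return None  # XX: only overwrite an existing key
        expires_at = entry[1] if (keepttl and entry is not None) else None
        self._data[name] = (value, expires_at)
        return True


def approval_key(tenant_id, request_id):
    return f"approval:{tenant_id}:{request_id}"


def decide(r, tenant_id, request_id, decision):
    """Reviewer path: record 'accepted' or 'rejected' on a pending approval."""
    if decision not in ("accepted", "rejected"):
        raise ValueError(f"unknown decision: {decision!r}")
    key = approval_key(tenant_id, request_id)
    raw = r.get(key)
    if raw is None:
        return False  # expired or never submitted
    record = json.loads(raw)
    if record["status"] != "pending":
        return False  # first reviewer wins
    record["status"] = decision
    # XX + KEEPTTL: never recreate an expired key, never extend its lifetime.
    return bool(r.set(key, json.dumps(record), xx=True, keepttl=True))


def check_status(r, tenant_id, request_id):
    """Executor path: a missing key is treated as an automatic rejection."""
    raw = r.get(approval_key(tenant_id, request_id))
    return "rejected" if raw is None else json.loads(raw)["status"]
```

The read-then-write in `decide` is not atomic, so two reviewers racing could both pass the pending check. A production version would likely wrap it in a `WATCH`/`MULTI` transaction or a Lua script.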
This decision is Accepted with known replacement conditions (see Reversal plan). We expect Temporal to become the right answer once any of these triggers fires:
- Approval workflows exceed 24h (e.g. weekend approvals, multi-step reviewer chains)
- Workflows need compensating actions on rejection (rollback DB changes, revoke tokens, notify downstream systems)
- Approvals need fan-out (multiple required approvers, quorum logic)
- Cross-region durability becomes a requirement
Tradeoffs we accept
| Lever | Temporal | Redis + TTL (chosen) |
|---|---|---|
| Day-1 cost | Temporal cluster + UI + RPC | Zero new infra |
| Durability across long timeouts | Native (months OK) | 24h hard cap |
| Compensating actions on reject | First-class workflow primitive | Application code |
| Multi-step approval chains | First-class | Application code (clunky) |
| Audit trail | Workflow event history | Application audit log + Redis TTL log |
| Visibility (UI for in-flight workflows) | Built-in Temporal UI | Build it ourselves |
| Operator skill required | Workflow programming model | Redis already known by every engineer |
| At-most-once vs at-least-once delivery | Configurable | Application-level semantics |
Consequences (positive)
- v1 ships in <100 lines of FastAPI + Redis code.
- The 24h TTL is itself a feature — stale approvals auto-reject, so the "ghost approvals" problem (a reviewer who left for vacation blocking a request indefinitely) cannot happen.
- Audit trail uses the same logger as everything else.
- Reviewers and agents see the same Redis state — no cross-system consistency questions.
Consequences (negative)
- 24h hard cap is real. A weekend approval submitted Friday evening will auto-reject before Monday morning. Mitigation: per-tenant override TTL is configurable; some tenants get 72h.
- No compensating actions. A rejected agent action that already pre-fetched data does not roll back the side effects automatically. Mitigation: agent actions in v1 are read-only; mutating actions are out of scope until ADR-???-future.
- No multi-approver workflows. First reviewer wins. Mitigation: acceptable at v1 — mutation-class actions don't ship.
- No visibility UI. Operators inspect via `redis-cli KEYS` plus the FastAPI health endpoint. Mitigation: Module 03 Grafana panel shows pending-approval count + age histogram per tenant.
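The weekend case and the per-tenant TTL override from the first item in this list can be checked with a little date arithmetic. The sketch below is illustrative only: the override table, the tenant names, and the `ttl_for` and `expires_at` helpers are assumptions, not part of the starter kit.

```python
from datetime import datetime, timedelta

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # the 24h cap from the Decision section

# Hypothetical override table; tenant names are made up for illustration.
TENANT_TTL_OVERRIDES = {"tenant-weekend": 72 * 60 * 60}


def ttl_for(tenant_id):
    """Return the approval TTL in seconds for a tenant."""
    return TENANT_TTL_OVERRIDES.get(tenant_id, DEFAULT_TTL_SECONDS)


def expires_at(submitted_at, tenant_id):
    """Moment at which the pending approval's key would expire."""
    return submitted_at + timedelta(seconds=ttl_for(tenant_id))


friday_evening = datetime(2024, 3, 1, 18, 0)  # a Friday
monday_morning = datetime(2024, 3, 4, 9, 0)

# Default tenant: expires Saturday 18:00, well before Monday morning.
default_expiry = expires_at(friday_evening, "tenant-default")

# 72h tenant: expires Monday 18:00, so a Monday-morning reviewer still sees it.
weekend_expiry = expires_at(friday_evening, "tenant-weekend")
```

A 72h TTL covers an ordinary weekend but not a three-day one: the same Friday 18:00 submission would expire Monday evening, before a reviewer returning on Tuesday.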
Reversal plan
The replacement is well-scoped because the application contract is small:
- Implement `temporal_approvals.py` with `submit()`, `accept()`, `reject()`, and `wait_for_decision()` matching the current Redis interface.
- Stand up a Temporal cluster (managed Temporal Cloud is the path of least resistance, ~$0.30/k actions).
- Switch the approval worker via feature flag.
- Migrate in-flight Redis approvals: walk `approval:*` keys, replay into Temporal as workflow signals.
- Cut over after a 1-week soak period.
Estimated effort: 3-4 engineer-weeks including Temporal infra setup. Reversible — both engines can run in parallel during the soak.
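One way to picture the "small application contract" is as a shared interface that both engines satisfy, with the feature flag choosing between them. The sketch below assumes the signatures: the plan names only the four functions, so the parameters, return types, `InMemoryEngine`, and `select_engine` are all hypothetical.

```python
from typing import Optional, Protocol


class ApprovalEngine(Protocol):
    """The four calls named in the reversal plan; signatures are assumed."""

    def submit(self, tenant_id: str, request_id: str, payload: dict) -> None: ...
    def accept(self, tenant_id: str, request_id: str) -> None: ...
    def reject(self, tenant_id: str, request_id: str) -> None: ...
    def wait_for_decision(self, tenant_id: str, request_id: str) -> Optional[str]: ...


class InMemoryEngine:
    """Toy engine standing in for either backend so the sketch runs."""

    def __init__(self, name: str):
        self.name = name
        self._status: dict = {}

    def submit(self, tenant_id, request_id, payload):
        self._status[(tenant_id, request_id)] = "pending"

    def accept(self, tenant_id, request_id):
        self._decide(tenant_id, request_id, "accepted")

    def reject(self, tenant_id, request_id):
        self._decide(tenant_id, request_id, "rejected")

    def _decide(self, tenant_id, request_id, decision):
        key = (tenant_id, request_id)
        if self._status.get(key) == "pending":  # first reviewer wins
            self._status[key] = decision

    def wait_for_decision(self, tenant_id, request_id):
        # A real engine would block or poll; this returns the current state,
        # or None while the approval is still pending or unknown.
        status = self._status.get((tenant_id, request_id))
        return None if status in (None, "pending") else status


def select_engine(use_temporal: bool, redis_engine: ApprovalEngine,
                  temporal_engine: ApprovalEngine) -> ApprovalEngine:
    """Feature-flag switch: callers never learn which backend they got."""
    return temporal_engine if use_temporal else redis_engine
```

Because callers depend only on the four methods, running both engines in parallel during the soak reduces to flipping the flag per request or per tenant.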
References
- `apps/web/public/downloads/enterprise-ai-platform-starter.zip!/api/approvals.py`
- `apps/web/public/downloads/enterprise-ai-platform-starter.zip!/governance/agent_executor.py`
- ADR-003 (Redis policy store; uses the same Redis instance, separate keyspace)