ADR-003: Policy engine is Python rules + Redis store, not OPA/Rego | Enterprise AI Platform

Context

The platform must enforce dynamic, per-tenant policies on every request: which actions are allowed, which require approval, and which must mask outputs. Policies change frequently: a tenant signs a new DPA, a new PII type appears, or an incident requires temporarily disabling a class of action. Three options were considered:

Open Policy Agent (OPA): a separate policy engine, typically deployed as a sidecar, that evaluates Rego rules; the de facto standard for declarative policy.
Cedar (AWS's newer policy language): similar shape, less mature ecosystem.
Python rule evaluator + Redis policy store: policies live in Redis as JSON, and the application loads and evaluates them in-process.

Decision

Adopt the Python rule evaluator + Redis policy store. Policies are stored in Redis as JSON documents; the PolicyEnforcer loads them at request time and evaluates rule conditions (eq, in, gte, range) against the request context (tenant, user role, action, resource tags). The first matching policy determines the decision; if no policy matches, the request is allowed by default.

# governance/policy.py
class PolicyEnforcer:
    def evaluate(self, context: dict) -> PolicyDecision:
        # Policies are checked in the order returned; the first match wins.
        for policy in self._load_policies(context["tenant_id"]):
            if self._matches(policy, context):
                return PolicyDecision(
                    action=policy["action"],  # allow / deny / require_approval / mask
                    reason=policy["reason"],
                )
        # No policy matched: the request is allowed by default.
        return PolicyDecision(action="allow", reason="default")
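
This ADR does not spell out the rule schema or operator semantics, so the following is only a minimal sketch of how a matcher along the lines of _matches could work. It assumes each policy carries a list of conditions of the form {field, op, value} that must all hold. The `conditions` key, the field names, and the inclusive [low, high] encoding for range are illustrative assumptions, not the schema used in governance/policy.py.

```python
from typing import Any, Callable

# Hypothetical operator table; the ADR names eq, in, gte and range only.
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "eq": lambda actual, expected: actual == expected,
    "in": lambda actual, expected: actual in expected,
    "gte": lambda actual, expected: actual >= expected,
    # Assumed encoding: expected is an inclusive [low, high] pair.
    "range": lambda actual, expected: expected[0] <= actual <= expected[1],
}


def matches(policy: dict, context: dict) -> bool:
    """Return True when every condition in the policy holds for the context."""
    for cond in policy.get("conditions", []):
        field = cond["field"]
        if field not in context:
            return False  # a missing field never satisfies a condition
        op = OPERATORS.get(cond["op"])
        if op is None:
            raise ValueError(f"unknown operator: {cond['op']}")
        if not op(context[field], cond["value"]):
            return False
    return True


# Example policy as it might be stored in Redis (shape is an assumption).
mask_pii_for_analysts = {
    "action": "mask",
    "reason": "analysts may not see raw PII",
    "conditions": [
        {"field": "user_role", "op": "eq", "value": "analyst"},
        {"field": "action", "op": "in", "value": ["read", "export"]},
    ],
}
```

In this sketch an unknown operator raises rather than silently failing to match, so a typo in a stored rule surfaces as an error instead of falling through to the default allow.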

Tradeoffs we accept

Lever	OPA + Rego	Python + Redis (chosen)
Day-1 setup cost	Sidecar deploy + RBAC + RPC	Zero new infra (Redis already exists)
Formal verification	Rego is a real DSL — can be model-checked	Python rules are not formally verifiable
Rule expressiveness	Higher (set ops, recursion, derived facts)	Lower (eq / in / gte / range)
Hot-reload latency	OPA bundle pull (seconds)	Redis read on cache miss (milliseconds)
Audit trail	OPA decision logs (separate pipeline)	Built into application audit log
Operator skill required	Rego learning curve	Python — already known by every engineer
Compliance review story	"We use the de-facto policy DSL"	"Our policy is application code" — reviewer needs to read Python

We optimize for fewer moving parts in v1 and already-known team skills. Rego buys formal verification and ecosystem polish; we neither need verification at v1 scale nor have a Rego-fluent reviewer on the team. The reversal plan below shows the swap path when either becomes true.

Consequences (positive)

Policy authoring is plain JSON in Redis. A redis-cli SET is a policy hot-reload — no rebuild, no bundle pull.
The PolicyEnforcer is unit-testable in-process as plain Python, with the Redis-backed policy load stubbed out.
Policy decisions land in the same audit log as RAG queries — one pipeline to monitor, one runbook to maintain.
No new infrastructure to harden, monitor, or RBAC.
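
To illustrate the hot-reload and testability points above, here is a minimal sketch of a policy loader behind a two-method store interface. The `policies:{tenant_id}` key layout, the `PolicyStore` protocol, and the `InMemoryStore` stand-in are assumptions made for illustration; in production the store would be a Redis client, whose GET and SET commands have the same shape.

```python
import json
from typing import Optional, Protocol, Union


class PolicyStore(Protocol):
    """The subset of a Redis-like client this sketch relies on."""

    def get(self, key: str) -> Optional[Union[str, bytes]]: ...
    def set(self, key: str, value: str) -> object: ...


class InMemoryStore:
    """Dict-backed stand-in so the loader can be exercised without Redis."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> bool:
        self._data[key] = value
        return True


def load_policies(store: PolicyStore, tenant_id: str) -> list[dict]:
    """Read the tenant's policy list; an absent key means no policies."""
    raw = store.get(f"policies:{tenant_id}")  # hypothetical key layout
    return json.loads(raw) if raw is not None else []


store = InMemoryStore()
before = load_policies(store, "acme")

# Equivalent in spirit to a redis-cli SET: the next read sees the new policy.
store.set(
    "policies:acme",
    json.dumps([{"action": "deny", "reason": "incident freeze"}]),
)
after = load_policies(store, "acme")
```

This loader reads the store on every call. If the enforcer also caches policies in process, a write becomes visible only once the cached entry expires or is invalidated.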

Consequences (negative)

No formal verification. A buggy rule that silently denies a class of legitimate requests can ship. Mitigation: red-team pytest suite in Module 03 covers known policy regressions.
Limited expressiveness. Rules cannot reference derived facts (e.g. "deny if there were more than 3 prior denials in the last hour"). A workaround using Redis counters is documented in governance/policy.py.
Compliance reviewer friction. A SOC 2 auditor expecting a Rego bundle will need to read Python rule logic instead. Mitigation: the Python evaluator is ~80 lines and the JSON rules are exportable as human-readable policies in the compliance report.
No dedicated decision log. OPA produces a structured decision log out of the box. We log decisions in the application audit log — same data, less polish.
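
One way to approach the derived-fact limitation is to maintain a counter outside the rule and pass its value into the request context, where an ordinary gte condition can test it. The sketch below assumes a fixed one-hour window keyed by tenant and user. The key format, the `record_denial` function, and the `CounterStore` protocol are hypothetical and are not taken from governance/policy.py; `incr` and `expire` mirror the Redis INCR and EXPIRE commands.

```python
from typing import Protocol

WINDOW_SECONDS = 3600  # "last hour" from the example rule


class CounterStore(Protocol):
    def incr(self, key: str) -> int: ...
    def expire(self, key: str, seconds: int) -> object: ...


class FakeCounterStore:
    """In-memory stand-in for tests; it records TTLs but does not enforce them."""

    def __init__(self) -> None:
        self.counts: dict[str, int] = {}
        self.ttls: dict[str, int] = {}

    def incr(self, key: str) -> int:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]

    def expire(self, key: str, seconds: int) -> bool:
        self.ttls[key] = seconds
        return True


def record_denial(store: CounterStore, tenant_id: str, user_id: str) -> int:
    """Increment the denial counter and return the count in the current window."""
    key = f"denials:{tenant_id}:{user_id}"  # hypothetical key layout
    count = store.incr(key)
    if count == 1:
        # Start the window on the first denial so the counter resets itself.
        store.expire(key, WINDOW_SECONDS)
    return count


store = FakeCounterStore()
counts = [record_denial(store, "acme", "u1") for _ in range(4)]
# The latest count can be added to the context, e.g. context["recent_denials"],
# so a rule such as {"field": "recent_denials", "op": "gte", "value": 4} can fire.
```

This is a fixed window that starts at the first denial, not a true sliding hour, and INCR followed by EXPIRE is not atomic, so a production version would need to handle a crash between the two calls.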

Reversal plan

Replacement is bounded by the PolicyEnforcer interface:

Add governance/opa_client.py exposing the same evaluate() method.
Translate the existing JSON rules to Rego (mechanical — there are ~30 rules in v1).
Switch PolicyEnforcer.__init__ to dispatch to OPA via feature flag.
Run shadow comparison for a week (both engines evaluate; mismatch alerts).
Cut over.

Estimated effort: 2 engineer-weeks for a tested swap. Reversible.
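
The dispatch and shadow-comparison steps can be sketched as a thin wrapper that implements the same evaluate() method. Everything here other than evaluate() and the two PolicyDecision fields is hypothetical: the `ShadowEnforcer` name, the `use_candidate` flag, and the mismatch callback are assumptions, and the stub engines stand in for the Python evaluator and an OPA client.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class PolicyDecision:
    action: str
    reason: str


class Engine(Protocol):
    def evaluate(self, context: dict) -> PolicyDecision: ...


class ShadowEnforcer:
    """Runs both engines, serves one, and reports disagreements."""

    def __init__(
        self,
        primary: Engine,
        candidate: Engine,
        on_mismatch: Callable[[dict, PolicyDecision, PolicyDecision], None],
        use_candidate: bool = False,  # hypothetical feature flag
    ) -> None:
        self.primary = primary
        self.candidate = candidate
        self.on_mismatch = on_mismatch
        self.use_candidate = use_candidate

    def evaluate(self, context: dict) -> PolicyDecision:
        served = self.primary.evaluate(context)
        try:
            shadow = self.candidate.evaluate(context)
        except Exception:
            # Fall back to the primary decision if the candidate engine fails.
            return served
        if shadow.action != served.action:
            self.on_mismatch(context, served, shadow)
        return shadow if self.use_candidate else served


class StaticEngine:
    """Stub engine that always returns the same action."""

    def __init__(self, action: str) -> None:
        self.action = action

    def evaluate(self, context: dict) -> PolicyDecision:
        return PolicyDecision(action=self.action, reason="stub")


mismatches: list[tuple] = []
enforcer = ShadowEnforcer(
    primary=StaticEngine("allow"),
    candidate=StaticEngine("deny"),
    on_mismatch=lambda ctx, a, b: mismatches.append((ctx["tenant_id"], a.action, b.action)),
)
decision = enforcer.evaluate({"tenant_id": "acme"})
```

Because the wrapper exposes the same evaluate() signature, call sites would not need to change during the shadow week or at cutover; flipping `use_candidate` is the only switch in this sketch.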

The Cedar swap path is similar; a third engine could be added without changing the call sites.

References

apps/web/public/downloads/enterprise-ai-platform-starter.zip!/governance/policy.py
apps/web/public/downloads/enterprise-ai-platform-starter.zip!/audit/logger.py
ADR-004 (Redis approval queue — same Redis instance, separate keyspace)