Back to Agentic Data Pipeline
Cost Models·Agentic Data Pipeline
Agentic Data Pipeline cost model
Component-by-component baseline vs optimized cost breakdown for this EXPERT project. The source CSV ships in the project starter kit at docs/cost-model/ and is downloadable below.
| # agentic-data-pipeline cost model — v1 | |||||
|---|---|---|---|---|---|
| # Last updated: 2026-05-08 | |||||
| # | |||||
| # ASSUMED LOAD | |||||
| # - 5 | 000 agent runs / month (a mid-stage data team replacing rigid DAGs with adaptive agents) | ||||
| # - ~6 LLM calls per run on average (1 supervisor route + 4 workers + 1 retry/escalation) | |||||
| # - ~800 input + ~200 output tokens per LLM call | |||||
| # - ~30M total tokens / month (~24M input | ~6M output) | ||||
| # - AWS region us-east-1 | list prices as of 2026-05 | ||||
| # - LangSmith Plus tier on 30k traces/mo budget | |||||
| # | |||||
| component | sub | baseline_monthly_usd | optimized_monthly_usd | delta_usd | notes |
| Anthropic Claude Sonnet (planner + supervisor) | 100% baseline → 30% optimized · 24M in / 6M out tok/mo | 144.00 | 43.00 | 101.00 | Cascade: 70% routed to Haiku for worker calls; only supervisor + escalation reach Sonnet. Pricing: USD 3.00/M input + USD 15.00/M output. |
| Anthropic Claude Haiku (worker agents) | 70% of mix in optimized · ~17M in / 4M out tok/mo | 0.00 | 26.00 | -26.00 | New cost in optimized — Haiku replaces 70% of worker LLM calls. Pricing: USD 0.80/M input + USD 4.00/M output. |
| AWS RDS Postgres (db.t4g.medium) | 100GB gp3 · business data store (per ADR-002 split) | 50.00 | 35.00 | 15.00 | On-demand: ~USD 50/mo. 1-yr reserved: ~USD 35/mo (-30%). Storage: USD 0.10/GB-mo gp3. |
| AWS ElastiCache Redis (cache.t4g.small) | cache.t4g.small · checkpoint + recovery state (per ADR-002) | 35.00 | 26.00 | 9.00 | On-demand: ~USD 35/mo. 1-yr reserved: ~USD 26/mo (-26%). |
| LangSmith observability (Plus tier) | 30k traces/mo budget · agent + tool spans | 39.00 | 39.00 | 0.00 | Pricing: Plus tier flat USD 39/mo · 30k traces included. No optimization lever at this scale. |
| GitHub Actions + container registry | ~150 PR runs/mo × 6 min × Linux runners + GHCR storage | 12.00 | 12.00 | 0.00 | Pricing: USD 0.008 / Linux runner-minute + free GHCR for public images. No optimization lever. |
| Total · 5k runs/mo | ~USD 0.056 per run at baseline · ~USD 0.036 optimized | 280.00 | 181.00 | 99.00 | −USD 99/mo · −35% — judge cascade carries most of the savings; reserved instances on RDS+EC the rest. |
| # | |||||
| # OPTIMIZATION LEVERS | |||||
| lever | description | impact | |||
| Model cascade (Haiku for workers · Sonnet for supervisor) | Route 70% of worker LLM calls to Haiku (USD 0.80/M in). Supervisor and escalation paths stay on Sonnet for routing quality. ADR-001 + ADR-003. | −USD 75 / mo · −27% | |||
| Idempotent tool-call cache | SHA-256 cache on (tool_name, args) for read-only tools (query_database, validation, file_processor). Redis-backed, 1h TTL on quality-sensitive tools, 24h on static reads. ~18% hit rate on regression suites. | −USD 14 / mo · grows with workload stability | |||
| RDS + ElastiCache 1-yr reserved | Commit to 12-month reserved capacity once load is stable for 30 days. ~30% off RDS, ~26% off ElastiCache. Break-even at month 4. | −USD 24 / mo · −28% on store cost | |||
| LangGraph checkpoint compression | gzip checkpoint values before write to Redis. ~60% size reduction on agent state. Lets us drop ElastiCache to cache.t4g.micro at half the cost. Trade: ~5ms compression overhead per checkpoint. | −USD 18 / mo if cache size becomes the bottleneck | |||
| # | |||||
| # WHEN THIS COST MODEL BREAKS | |||||
| # 1. Run rate crosses ~50k/mo → orchestrator overhead becomes the bottleneck. | |||||
| # Trigger to revisit ADR-001 (LangGraph vs custom orchestrator). | |||||
| # 2. Adversarial / retry fraction > 30% → escalation path dominates and cascade savings collapse. | |||||
| # Mitigation: tighten the M05 ToolCallGuard budget per run. | |||||
| # 3. Postgres storage > 200GB → upgrade db.t4g.medium → db.t4g.large (~USD 95/mo) or move analytics to ClickHouse. | |||||
| # 4. Tool calls / run > 30 → run cost dominates. Cap via ContextWindowManager (M05) or split into multiple runs. | |||||
| # | |||||
| # SOURCES | |||||
| # - Anthropic API pricing: https://www.anthropic.com/pricing (verified 2026-05-08) | |||||
| # - AWS RDS PostgreSQL pricing: https://aws.amazon.com/rds/postgresql/pricing/ (us-east-1 | on-demand + 1-yr reserved) | ||||
| # - AWS ElastiCache pricing: https://aws.amazon.com/elasticache/pricing/ (us-east-1 | cache.t4g.small) | ||||
| # - LangSmith pricing: https://smith.langchain.com/pricing (Plus tier) | |||||
| # - GitHub Actions pricing: https://github.com/pricing (Linux runners | private repos) | ||||
| # - Cross-checked against agent-platform-sla.yaml (the SLA + cost YAML shipped in the starter kit) |
Cost model in context
This cost model ships with Agentic Data Pipeline — including the 5 ADRs that produced the optimization deltas.
Open project →