How to Reduce Data Warehouse Costs: Step-by-Step Guide
Reduce data warehouse costs in 6 steps: audit to find waste → enable auto-suspend → add partition pruning → eliminate SELECT * → right-size compute → set budget alerts. Teams with no prior optimization typically see 40–70% cost reduction from this first pass, with the majority coming from auto-suspend and partition pruning alone.
Audit: find where money is going
Query account usage views to identify the top 20 most expensive queries, users, and warehouses. You cannot optimize what you cannot measure, so run this audit before changing anything.
-- Snowflake: top 20 queries by cost proxy (last 14 days)
-- Note: QUERY_HISTORY exposes only cloud-services credits per query;
-- use elapsed time and bytes scanned as proxies for warehouse compute cost.
SELECT
    query_text,
    user_name,
    warehouse_name,
    warehouse_size,
    ROUND(total_elapsed_time / 1000, 1) AS seconds,
    ROUND(credits_used_cloud_services, 5) AS cloud_services_credits,
    ROUND(bytes_scanned / 1e9, 2) AS gb_scanned
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -14, CURRENT_TIMESTAMP)
  AND execution_status = 'SUCCESS'
ORDER BY total_elapsed_time DESC
LIMIT 20;
Enable auto-suspend on all warehouses
Set auto-suspend to 60 seconds for interactive warehouses, 5 minutes for scheduled ETL. This single change is the highest-ROI optimization in almost every Snowflake environment.
-- Snowflake: set auto-suspend on each warehouse (repeat per warehouse)
ALTER WAREHOUSE analytics_wh
SET AUTO_SUSPEND = 60 -- seconds of inactivity before suspend
AUTO_RESUME = TRUE; -- auto-resume on next query
ALTER WAREHOUSE etl_wh
SET AUTO_SUSPEND = 300; -- 5 min for scheduled ETL jobs
-- Verify current settings
SHOW WAREHOUSES;
-- Check: AUTO_SUSPEND column should be 60 or 300, not 0 (never suspend)
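To put a number on the idle compute that auto-suspend eliminates, here is a rough sketch. It assumes the warehouse previously ran 24/7 and uses Snowflake's standard credits-per-hour rates by size (XS=1, S=2, M=4, L=8); `monthly_idle_savings` is an illustrative helper, not a Snowflake API.

```python
# Rough estimate of credits saved once a warehouse suspends when idle.
# Assumes the warehouse used to run around the clock; credit rates are
# Snowflake's standard credits/hour per size.

CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8}

def monthly_idle_savings(size: str, active_hours_per_day: float, days: int = 30) -> float:
    """Credits/month no longer burned on idle time after enabling auto-suspend."""
    idle_hours = (24 - active_hours_per_day) * days
    return idle_hours * CREDITS_PER_HOUR[size]

# A MEDIUM warehouse that is actually busy ~6 h/day:
print(monthly_idle_savings("MEDIUM", 6))  # 2160.0
```

At typical credit prices that single setting change is often worth thousands of dollars per month per warehouse, which is why it leads this guide.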
Add partitioning and eliminate full scans
Partition large tables by date. Add WHERE filters on the partition column to every query. In BigQuery, unfiltered queries scan the entire table — partitioning is mandatory for cost control at scale.
-- BigQuery: create partitioned + clustered table
CREATE TABLE analytics.orders
PARTITION BY created_date -- DATE column; use DATE(ts_col) only for TIMESTAMP columns
CLUSTER BY customer_id, status
AS SELECT * FROM analytics.orders_raw;
-- Every query must filter on the partition column
SELECT order_id, customer_id, amount_usd
FROM analytics.orders
WHERE created_date BETWEEN '2025-01-01' AND '2025-01-31' -- partition filter
AND status = 'completed'; -- cluster filter
-- Estimate bytes before running with a dry run (console, bq CLI, or API);
-- audit actual bytes billed afterward via INFORMATION_SCHEMA.JOBS
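Dry runs can be scripted, not just clicked in the console. A minimal sketch, assuming the `google-cloud-bigquery` client library, configured credentials, and the on-demand rate of $6.25/TiB (verify current pricing for your region); `estimate_cost_usd` and `dry_run_bytes` are illustrative helpers, not part of the BigQuery API.

```python
# Sketch: estimate a query's cost from a BigQuery dry run before executing it.
# The $6.25/TiB on-demand rate is an assumption; check current pricing.

def estimate_cost_usd(bytes_processed: int, usd_per_tib: float = 6.25) -> float:
    """Convert a dry-run byte estimate into an on-demand cost estimate."""
    return round(bytes_processed / 2**40 * usd_per_tib, 4)

def dry_run_bytes(client, sql: str) -> int:
    """Return bytes the query would scan, without running it or being billed."""
    from google.cloud import bigquery  # requires google-cloud-bigquery + credentials
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)
    return job.total_bytes_processed

# A hypothetical 1 TiB scan at the assumed rate:
print(estimate_cost_usd(2**40))  # 6.25
```

Wiring this into CI lets you fail a pull request when a changed query's estimated scan jumps past a byte budget.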
Eliminate SELECT * in all production queries
In columnar warehouses, SELECT * scans every column even when only a few are needed. Replace with explicit column lists in all dbt models and production queries.
-- BAD: scans all 200 columns even though you need 4
SELECT * FROM orders WHERE created_date = CURRENT_DATE;
-- GOOD: only the 4 needed columns are scanned
SELECT order_id, customer_id, amount_usd, status
FROM orders
WHERE created_date = CURRENT_DATE;
-- dbt: add a SQLFluff check in CI to block SELECT * in models
# .sqlfluff -- lint only the wildcard rule (AM04, ambiguous.column_count)
[sqlfluff]
rules = ambiguous.column_count
Right-size warehouses to match SLAs
Test each scheduled pipeline at progressively smaller warehouse sizes. Use the smallest size where the pipeline meets its SLA. Document the minimum size for each pipeline.
-- Snowflake: test pipeline at multiple warehouse sizes
-- Step 1: tag your ETL queries
ALTER SESSION SET QUERY_TAG = 'etl_nightly_v1';
-- Step 2: run at S, M, L and compare
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'SMALL';
-- run your pipeline...
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'MEDIUM';
-- run again...
-- Step 3: compare runtimes by size
-- (compute cost ≈ runtime × the size's credits/hour, so a faster run on a
-- bigger warehouse is not automatically cheaper)
SELECT warehouse_size,
       COUNT(*) AS runs,
       ROUND(AVG(total_elapsed_time) / 60000, 1) AS avg_minutes
FROM snowflake.account_usage.query_history
WHERE query_tag = 'etl_nightly_v1'
GROUP BY 1
ORDER BY avg_minutes;
-- Pick the smallest size where avg_minutes is under the SLA target
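The "smallest size that meets the SLA" decision can be sketched in a few lines. The credit rates are Snowflake's standard credits/hour per size; the timing numbers are made up for illustration, and `cheapest_size_within_sla` is a hypothetical helper.

```python
# Sketch: from measured pipeline runtimes per warehouse size, pick the
# cheapest size that still meets the SLA. Credits/hour per size are
# Snowflake's standard rates (XS=1, S=2, M=4, L=8, XL=16).

CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

def cheapest_size_within_sla(timings: dict, sla_minutes: float) -> str:
    """timings: size -> average pipeline minutes measured at that size."""
    candidates = [
        (minutes / 60 * CREDITS_PER_HOUR[size], size)
        for size, minutes in timings.items()
        if minutes <= sla_minutes
    ]
    if not candidates:
        raise ValueError("no size meets the SLA; scale up or split the pipeline")
    return min(candidates)[1]  # lowest credits among SLA-compliant sizes

# Nightly ETL measured at three sizes, with a 45-minute SLA:
timings = {"SMALL": 52.0, "MEDIUM": 30.0, "LARGE": 18.0}
print(cheapest_size_within_sla(timings, 45))  # MEDIUM
```

Note the example: LARGE finishes faster but burns more credits (18 min × 8/hr ≈ 2.4) than MEDIUM (30 min × 4/hr = 2.0), so MEDIUM wins.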
Set budget alerts and cost attribution
Create resource monitors that notify or suspend warehouses when spend exceeds a threshold. Tag all queries and warehouses by team so each team sees and owns their costs.
-- Snowflake: resource monitor with auto-suspend at 110% of budget
CREATE RESOURCE MONITOR analytics_budget
WITH CREDIT_QUOTA = 500 -- monthly credit budget
TRIGGERS
ON 75 PERCENT DO NOTIFY -- alert at 75%
ON 100 PERCENT DO NOTIFY -- alert at 100%
ON 110 PERCENT DO SUSPEND; -- auto-suspend at 110%
ALTER WAREHOUSE analytics_wh
SET RESOURCE_MONITOR = analytics_budget;
-- Tag queries by team for attribution
ALTER SESSION SET QUERY_TAG = 'team:marketing|pipeline:daily_report';
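The `key:value|key:value` tag format above is a convention for this guide, not a Snowflake standard, so attribution reports need to parse it themselves. A minimal sketch (`parse_query_tag` is a hypothetical helper):

```python
# Sketch: parse the "team:marketing|pipeline:daily_report" tag convention
# into fields for a per-team chargeback report.

def parse_query_tag(tag: str) -> dict:
    """'team:marketing|pipeline:daily_report' -> {'team': ..., 'pipeline': ...}"""
    fields = {}
    for part in tag.split("|"):
        key, _, value = part.partition(":")
        if key:
            fields[key.strip()] = value.strip()
    return fields

print(parse_query_tag("team:marketing|pipeline:daily_report"))
# {'team': 'marketing', 'pipeline': 'daily_report'}
```

Group the parsed `team` field against query cost proxies from `query_history` and each team can see, and own, its share of the bill.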
Expected Savings by Technique
| Technique | Typical savings | Effort |
|---|---|---|
| Auto-suspend (60s) | 30–50% compute cost | Low — 1 ALTER per warehouse |
| Partition pruning | 50–90% scan cost on large tables | Medium — add filters to queries |
| Eliminate SELECT * | 30–80% column scan cost | Medium — update all dbt models |
| Warehouse right-sizing | 25–60% compute cost | Medium — test and document SLAs |
| Storage tiering (cold) | 50–80% storage cost on old data | Low — set lifecycle policy |
| Budget alerts | 0% direct — prevents overruns | Low — 1 resource monitor setup |
Common Issues
Auto-suspend breaks streaming or always-on use cases
Warehouses that serve real-time dashboards or always-on applications should use a longer auto-suspend (10–15 min) or dedicated always-on warehouses. Never set 0 (no suspend) — always have a safety net.
Partition pruning not working — no savings
Check your WHERE clause includes the exact partition column, not a derived expression. In BigQuery: WHERE DATE(created_at) = '2025-01-01' does NOT prune partitions. Use WHERE created_date = '2025-01-01' on a DATE partition column.
Budget alerts fire but no one acts
Alerts only reduce costs if someone responds. Route alerts to a Slack channel with an on-call rotation and a runbook: "when this fires, suspend warehouse X and file a cost review ticket."
FAQ
- What is the fastest way to reduce Snowflake costs?
- Enable auto-suspend at 60 seconds on all warehouses. This alone reduces most teams' Snowflake bill by 30–50% immediately by eliminating idle compute.
- How do you reduce BigQuery costs?
- Partition tables by date, always filter on the partition column, and never use SELECT *. These three changes address the root cause of BigQuery overspend: scanning too much data.
- How much can you realistically save?
- 40–70% from the first optimization pass is typical. Auto-suspend + partition pruning + column projection compound: each reduces a different cost component independently.
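The compounding claim can be made concrete with a simplified model: treat each technique as an independent multiplicative cut to the bill, so combined savings are 1 minus the product of what remains. The rates below are illustrative midpoints from the table above, not guarantees, and real bills compound less cleanly because compute and scan costs have different bases.

```python
# Simplified model of compounding savings: each technique leaves a
# (1 - rate) fraction of the bill, applied multiplicatively.

def combined_savings(rates):
    """rates: per-technique savings fractions, e.g. [0.40, 0.50, 0.30]."""
    remaining = 1.0
    for r in rates:
        remaining *= 1 - r
    return 1 - remaining

# auto-suspend 40%, then partition pruning 50% of the remainder,
# then SELECT * elimination 30% of what's left:
print(round(combined_savings([0.40, 0.50, 0.30]), 2))  # 0.79
```

Three moderate individual wins stack to roughly the 40–70%+ first-pass range the guide quotes, which is why no single technique needs to be heroic.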