What is Apache Iceberg?
The Open Table Format
Iceberg brings ACID transactions, time travel, schema evolution, and hidden partitioning to data lakes on any object storage — without locking you into a single query engine.
Quick Answer
Apache Iceberg is an open table format that adds a metadata layer on top of data files (typically Parquet, also ORC or Avro) in object storage (S3, GCS, ADLS). It enables ACID transactions, time travel via snapshots, schema evolution without rewrites, and hidden partitioning — making your data lake behave like a data warehouse while keeping data in open formats that any engine can read.
What is Apache Iceberg?
Iceberg was created at Netflix in 2017 to solve the limitations of the Hive table format at scale: slow partition discovery, unsafe concurrent writes, and the inability to change schemas without rewriting data. It was open-sourced and donated to the Apache Software Foundation in 2018. Today it is the dominant open table format, with native support in Spark, Flink, Trino, DuckDB, Snowflake, BigQuery, and Redshift.
Hive Table Format
The Old Way
Partitions are directories. Schema changes rewrite data. No ACID — concurrent writers corrupt tables. No time travel. Partition pruning requires explicit WHERE clauses. Slow partition discovery on large tables.
Iceberg Table Format
The Open Standard
Metadata tree enables fast planning even on tables with millions of files. ACID via optimistic concurrency control. Schema evolution without rewrites. Time travel via snapshots. Hidden partitioning — no manual partition filters needed.
Why Iceberg Matters
Before Iceberg (Hive)
- • Concurrent writes corrupt tables — no ACID
- • Schema changes require full table rewrites
- • Queries must filter on partition columns explicitly
- • No time travel — data overwritten is gone forever
- • Partition discovery scans all directories on every query
- • Rename a column → break all downstream queries
With Iceberg
- • ACID transactions via optimistic concurrency control
- • Add, drop, rename columns — no data rewrite
- • Hidden partitioning — automatic partition pruning
- • Time travel: query any past snapshot by timestamp or ID
- • Metadata tree: fast planning with no directory listing, even on tables with millions of files
- • Works with Spark, Flink, Trino, DuckDB, Snowflake
What You Can Build with Iceberg
Iceberg is the foundation of the modern lakehouse.
Multi-Engine Lakehouse
Ingest with Flink, transform with Spark, query with Trino and DuckDB — all on the same Iceberg tables. No proprietary format lock-in.
CDC + Row-Level Updates
Apply INSERT, UPDATE, DELETE, and MERGE INTO on lake data. Iceberg's copy-on-write and merge-on-read modes handle high-frequency CDC from Debezium.
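As a sketch, a Debezium-style change batch can be applied with a single MERGE INTO; the catalog.db.events table name, the cdc_batch staging view, and its op column are illustrative assumptions:

```sql
-- Apply one CDC micro-batch (cdc_batch and its op flag are hypothetical)
MERGE INTO catalog.db.events t
USING cdc_batch s
ON t.event_id = s.event_id
WHEN MATCHED AND s.op = 'D' THEN DELETE      -- tombstones become deletes
WHEN MATCHED THEN UPDATE SET *               -- upsert changed rows
WHEN NOT MATCHED AND s.op != 'D' THEN INSERT *
```

Each MERGE commits as one atomic snapshot, so readers never see a half-applied batch.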
Audit & Compliance
Time travel provides an immutable audit trail. Query data as it existed on any past date for regulatory compliance, GDPR deletion verification, or incident forensics.
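A minimal sketch of an audit workflow, assuming a catalog.db.events table: list the snapshot log via Iceberg's metadata tables, then query the table as it existed at a chosen snapshot (the snapshot ID shown is a placeholder):

```sql
-- Find an audit point in the snapshot log (metadata table)
SELECT snapshot_id, committed_at, operation
FROM catalog.db.events.snapshots;

-- Read the table exactly as it was at that snapshot
SELECT * FROM catalog.db.events VERSION AS OF 1234567890123456789;
```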
Data Lakehouse with Catalog
Use AWS Glue, Nessie, or Polaris as a catalog. Tables are discoverable across engines and environments with consistent schema and partition metadata.
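A hedged sketch of wiring Spark to a shared catalog — the catalog name `lake`, the REST endpoint URL, and the warehouse bucket are assumptions; the class names are Iceberg's standard Spark integration:

```properties
# Spark conf registering an Iceberg REST catalog (endpoint/bucket are hypothetical)
spark.sql.catalog.lake            org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.lake.type       rest
spark.sql.catalog.lake.uri        https://catalog.example.com
spark.sql.catalog.lake.warehouse  s3://my-bucket/warehouse
spark.sql.extensions              org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
```

Point Flink and Trino at the same catalog endpoint and every engine sees the same tables, schemas, and snapshots.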
Streaming + Batch Unified
Flink writes streaming micro-batches to Iceberg; Spark reads the same table for batch analytics. No ETL between streaming and batch layers.
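A rough Flink SQL sketch of the streaming side, assuming a REST catalog at a hypothetical endpoint and an already-defined kafka_source table:

```sql
-- Flink SQL: attach the shared Iceberg catalog (uri is hypothetical)
CREATE CATALOG lake WITH (
  'type' = 'iceberg',
  'catalog-type' = 'rest',
  'uri' = 'https://catalog.example.com'
);

-- Continuously append the stream into the shared table
INSERT INTO lake.db.events
SELECT event_id, user_id, event_type, event_ts FROM kafka_source;
```

Spark batch jobs then read lake.db.events directly — no copy between a "speed layer" and a "batch layer".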
Safe Schema Migration
Evolve schemas on petabyte-scale tables without downtime. Add nullable columns, rename fields, reorder columns — all metadata-only operations that complete in milliseconds.
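The operations above can be sketched as plain DDL, assuming the catalog.db.events table from the earlier example; because Iceberg tracks columns by ID, each statement touches only metadata:

```sql
-- Metadata-only schema changes: no data files are rewritten
ALTER TABLE catalog.db.events RENAME COLUMN event_type TO event_name;
ALTER TABLE catalog.db.events ADD COLUMN device_type STRING;
ALTER TABLE catalog.db.events DROP COLUMN device_type;
ALTER TABLE catalog.db.events ALTER COLUMN event_name AFTER user_id;  -- reorder
```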
How Iceberg Works
Iceberg is a three-layer metadata hierarchy that sits on top of your data files.
CATALOG
- Table pointer
- Current metadata
- Glue / Nessie / REST
- Engine discovery
METADATA
- Snapshot log
- Schema history
- Partition spec
- Sort order
MANIFESTS
- Manifest list
- Manifest files
- File stats & bounds
- Partition data
DATA FILES
- Parquet / ORC
- S3 / GCS / ADLS
- Immutable files
- Delete files
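Each layer of this hierarchy is queryable. A sketch using Iceberg's built-in metadata tables (the catalog.db.events name is an assumption):

```sql
-- Inspect the metadata layers directly
SELECT * FROM catalog.db.events.snapshots;  -- snapshot log (METADATA layer)
SELECT * FROM catalog.db.events.manifests;  -- manifest files (MANIFESTS layer)
SELECT * FROM catalog.db.events.files;      -- data files with stats & bounds
```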
-- Create an Iceberg table with hidden partitioning
CREATE TABLE catalog.db.events (
  event_id BIGINT,
  user_id BIGINT,
  event_type STRING,
  event_ts TIMESTAMP
)
USING iceberg
PARTITIONED BY (days(event_ts));  -- hidden partition

-- Time travel: query yesterday's snapshot
SELECT * FROM catalog.db.events
TIMESTAMP AS OF '2026-03-22 00:00:00';

-- Schema evolution: add column (no rewrite)
ALTER TABLE catalog.db.events
ADD COLUMN session_id STRING;

Iceberg vs Delta Lake vs Hudi
Apache Iceberg
- • Engine-agnostic: Spark, Flink, Trino, DuckDB, Snowflake
- • Metadata tree for fast planning on huge tables
- • Partition evolution without data rewrites
- • Strong multi-engine community (Apple, Netflix, AWS)
- • REST catalog standard emerging as universal interface
Delta Lake
- • Deep Databricks/Spark integration, DeltaSharing
- • Liquid clustering (auto-optimized layout)
- • Strong in Databricks-centric organizations
- • Delta Universal Format (UniForm) adds Iceberg compat
- • Growing multi-engine support but Spark-first heritage
Verdict: Choose Iceberg for multi-engine architectures and open interoperability. Choose Delta Lake if you are all-in on Databricks. Both are production-ready; the choice depends on your query engine mix, not technical quality.
| Feature | Iceberg | Delta Lake | Hudi |
|---|---|---|---|
| ACID transactions | ✓ | ✓ | ✓ |
| Time travel | ✓ snapshots | ✓ versions | ✓ timeline |
| Schema evolution | ✓ all ops | ✓ most ops | ✓ limited |
| Hidden partitioning | ✓ | ✗ | ✗ |
| Partition evolution | ✓ no rewrite | ✗ (liquid clustering instead) | ✓ limited |
| Multi-engine support | Best | Growing | Limited |
| Row-level deletes | ✓ CoW / MoR | ✓ | ✓ MoR-first |
Common Mistakes
Not running compaction
Iceberg writes small files on every commit (especially from streaming). Without regular compaction (rewriting small files into larger ones), query performance degrades. Run CALL catalog.system.rewrite_data_files(table => 'db.events') on a schedule.
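A sketch of a scheduled compaction call via Iceberg's Spark procedure — the table name and the 512 MB target size are illustrative choices, not defaults you must use:

```sql
-- Rewrite small files into ~512 MB files (sizes/names are illustrative)
CALL catalog.system.rewrite_data_files(
  table => 'db.events',
  options => map('target-file-size-bytes', '536870912')
);
```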
Forgetting to expire snapshots
Every write creates a new snapshot. Old snapshots keep data files alive that cannot be garbage collected. Run CALL catalog.system.expire_snapshots(table => 'db.events') weekly to free up storage and reduce metadata size.
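A sketch of a weekly expiry job — the table name, cutoff timestamp, and retention count are example values to tune against your time-travel requirements:

```sql
-- Drop snapshots older than the cutoff, but always keep the last 10
CALL catalog.system.expire_snapshots(
  table => 'db.events',
  older_than => TIMESTAMP '2026-03-15 00:00:00',
  retain_last => 10
);
```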
Using wrong write mode for CDC
Copy-on-write (CoW) rewrites entire data files on every update — fast reads, slow writes. Merge-on-read (MoR) appends delete files — fast writes, slightly slower reads. Choose based on your update frequency.
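The mode is a per-table, per-operation setting. A sketch of switching a high-churn CDC table (name assumed) to merge-on-read:

```sql
-- Favor write throughput: append delete files instead of rewriting data files
ALTER TABLE catalog.db.events SET TBLPROPERTIES (
  'write.delete.mode' = 'merge-on-read',
  'write.update.mode' = 'merge-on-read',
  'write.merge.mode'  = 'merge-on-read'
);
```

Pair merge-on-read with regular compaction so accumulated delete files don't slow reads.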
Choosing partition spec before understanding query patterns
Partitioning by days(event_ts) is useless if all queries filter by user_id. Profile your queries first, then define the partition spec. Iceberg's partition evolution lets you change it later without a rewrite.
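Partition evolution itself is a pair of metadata-only DDL statements (table name assumed; old files keep their old spec, new writes use the new one):

```sql
-- Repartition for user_id-heavy queries without rewriting any data
ALTER TABLE catalog.db.events ADD PARTITION FIELD bucket(16, user_id);
ALTER TABLE catalog.db.events DROP PARTITION FIELD days(event_ts);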
Skipping the catalog
Running Iceberg without a catalog (Hadoop catalog, file-based) limits you to one engine. Use a REST catalog (AWS Glue, Nessie, Polaris) to share tables across Spark, Trino, and DuckDB simultaneously.
Who Should Learn Iceberg?
Junior Engineer
Understand the lakehouse
Learn the Iceberg metadata model, create tables, run time travel queries, and understand why hidden partitioning matters. Foundation for any data lake role.
Senior Engineer
Build production lakehouses
Design partition strategies, implement CDC pipelines with MERGE INTO, tune compaction and expiry schedules, and configure multi-engine catalogs (Glue, Nessie).
Staff / Architect
Define table format strategy
Choose between Iceberg, Delta, and Hudi for the org. Design catalog architecture, govern schema evolution policies, and migrate legacy Hive tables to Iceberg at scale.
FAQ
- What is Apache Iceberg?
- Apache Iceberg is an open table format that adds ACID transactions, time travel, schema evolution, and hidden partitioning to data lakes on object storage (S3, GCS, ADLS). It works with Spark, Flink, Trino, DuckDB, Snowflake, and BigQuery.
- What is the difference between Apache Iceberg and Delta Lake?
- Both add ACID and time travel to data lakes. Iceberg is engine-agnostic with a metadata tree for fast planning — preferred for multi-engine architectures. Delta Lake has deep Databricks/Spark integration — preferred for Databricks-centric shops. Both are production-ready.
- How does Iceberg time travel work?
- Iceberg keeps an immutable snapshot for every write. You can query past snapshots with TIMESTAMP AS OF or VERSION AS OF (snapshot ID) for auditing, debugging, or data recovery.
- What is hidden partitioning in Iceberg?
- Hidden partitioning lets you define partition specs (e.g. month(event_date)) without requiring query filters on the partition column. Iceberg applies pruning automatically, eliminating the #1 cause of slow queries on Hive-style tables.
- What is schema evolution in Iceberg?
- Schema evolution lets you add, drop, rename, or reorder columns without rewriting data files. Iceberg uses column IDs internally (not names), so renaming a column never breaks existing files or queries.
What You'll Build with AI-DE
The Iceberg Lakehouse project builds a production-grade open lakehouse from scratch — catalog setup through streaming CDC ingestion and multi-engine query federation.
- • Part 1: Catalog setup, first Iceberg tables, schema evolution, hidden partitioning
- • Part 2: MERGE INTO for CDC, copy-on-write vs merge-on-read, compaction strategy
- • Part 3: Flink streaming writes, Spark batch reads, Trino ad-hoc queries — same tables
- • Part 4: Time travel audit queries, snapshot expiry, data lifecycle governance