Amazon S3 Tables
S3 Tables are managed Apache Iceberg tables on S3 — purpose-built table buckets with auto-compaction, snapshot management, and up to 3× better query performance than self-managed Iceberg on standard S3.
Definition
Amazon S3 Tables are managed Apache Iceberg tables that live in a new bucket type called a table bucket. AWS handles table maintenance — compaction, snapshot expiration, unreferenced file cleanup — that data engineers traditionally script themselves. S3 Tables reached GA at re:Invent 2024 and integrate natively with Athena, Redshift, EMR, Glue, and Amazon SageMaker Lakehouse.
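The relationship between a table bucket, a namespace, and a table can be sketched with boto3's `s3tables` client. This is a minimal sketch, not a definitive setup: the bucket, namespace, and table names are hypothetical, and the parameter names reflect the boto3 `s3tables` API as commonly documented, so check them against the current SDK reference before relying on them.

```python
from typing import Any


def build_table_requests(bucket_arn: str, namespace: str, table: str) -> dict[str, dict[str, Any]]:
    """Return keyword arguments for the namespace and table creation calls."""
    return {
        # create_namespace takes the namespace as a single-element list
        "create_namespace": {"tableBucketARN": bucket_arn, "namespace": [namespace]},
        # create_table takes the namespace as a plain string
        "create_table": {
            "tableBucketARN": bucket_arn,
            "namespace": namespace,
            "name": table,
            "format": "ICEBERG",
        },
    }


def provision(bucket_name: str, namespace: str, table: str, client: Any = None) -> str:
    """Create a table bucket, a namespace, and an empty Iceberg table; return the table ARN."""
    if client is None:
        import boto3  # imported lazily so the request builder works without AWS access

        client = boto3.client("s3tables")
    bucket_arn = client.create_table_bucket(name=bucket_name)["arn"]
    requests = build_table_requests(bucket_arn, namespace, table)
    client.create_namespace(**requests["create_namespace"])
    return client.create_table(**requests["create_table"])["tableARN"]
```

Calling `provision("analytics-demo", "sales", "orders")` with valid credentials would make three API calls in order. Passing a stub as `client` lets you dry-run the sequence without touching AWS.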
Why S3 Tables vs raw Iceberg on S3
| Aspect | Self-managed Iceberg on S3 | S3 Tables |
|---|---|---|
| Compaction | You run Glue/Spark jobs on a schedule | Automatic, managed |
| Snapshot expiration | You schedule and maintain it yourself | Automatic |
| Query performance | Baseline | Up to 3× faster on selective queries |
| Transactions per second | General-purpose bucket baseline | Up to 10× higher |
| Catalog | Bring your own (Glue Catalog) | Native catalog or AWS Glue Data Catalog |
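To see why compaction matters for query performance, here is a simplified illustration of what a compaction pass does: it merges many small data files into fewer files near a target size, so a query opens fewer objects. This is a toy greedy model, not the algorithm AWS runs. The function name and the 512 MB default are assumptions made for the example.

```python
def compact(file_sizes_mb: list[float], target_mb: float = 512.0) -> list[float]:
    """Greedily merge small files into outputs of at most target_mb each."""
    outputs: list[float] = []
    current = 0.0  # size of the output file currently being filled
    for size in sorted(file_sizes_mb):
        if size >= target_mb:
            outputs.append(size)  # already at or above target; left as-is
            continue
        if current > 0 and current + size > target_mb:
            outputs.append(current)  # close the current output file
            current = 0.0
        current += size
    if current > 0:
        outputs.append(current)
    return outputs


small_files = [1.0] * 1000           # one thousand 1 MB files from frequent small writes
compacted = compact(small_files)
print(len(small_files), "->", len(compacted))  # prints "1000 -> 2"
```

The total data volume is unchanged. Only the number of objects a scan has to open drops, which is the effect the table attributes to managed compaction.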
Capabilities
- ACID transactions: Iceberg’s snapshot model commits each write atomically, so concurrent writers never expose partial results to readers.
- Schema evolution: Add, drop, or rename columns without rewriting data.
- Hidden partitioning: Partition transforms (year/month/day) are applied by the table itself, so query authors filter on the source column and never reference partition columns.
- Time travel: Query historical snapshots with `FOR TIMESTAMP AS OF`.
- Cross-engine reads: Athena, Redshift Spectrum, EMR, Glue, Snowflake (with external table), Databricks (Unity Catalog integration).
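As an illustration of the time travel capability, the helper below builds a query that uses the `FOR TIMESTAMP AS OF` clause in the form Athena accepts for Iceberg tables. It is a sketch under assumptions: the function name is hypothetical, the namespace and table names are placeholders, and the identifiers are assumed to come from trusted input because they are interpolated directly into the SQL string.

```python
from datetime import datetime, timezone


def time_travel_query(namespace: str, table: str, as_of: datetime) -> str:
    """Build a SELECT that reads the table as it existed at `as_of`."""
    if as_of.tzinfo is None:
        raise ValueError("as_of must be timezone-aware")
    # Normalize to UTC so the literal is unambiguous regardless of the caller's zone
    stamp = as_of.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return (
        f'SELECT * FROM "{namespace}"."{table}" '
        f"FOR TIMESTAMP AS OF TIMESTAMP '{stamp} UTC'"
    )


print(time_travel_query("sales", "orders", datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)))
# SELECT * FROM "sales"."orders" FOR TIMESTAMP AS OF TIMESTAMP '2025-01-15 12:00:00 UTC'
```

The query only succeeds if a snapshot at or before that timestamp still exists, so the usable time travel window is bounded by the table's snapshot retention.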
Pricing model
S3 Tables charge for:
- Storage (per GB-month, similar to S3 Standard)
- PUT / GET requests (per 1,000)
- Compaction (per object processed and per GB processed by the managed maintenance job)
- Monitoring (per 1,000 objects monitored)
For most analytical workloads the managed maintenance overhead is offset by query-performance gains; benchmark on your dataset before committing.
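A rough way to run that benchmark on the cost side is to plug your own volumes into the pricing dimensions listed above. The sketch below is only an arithmetic model: every rate is a placeholder you supply from the current AWS pricing page for your region, PUT and GET requests are folded into one blended rate for brevity, and the optional per-GB compaction term is an assumption you can leave at zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Unit prices in USD. All values are caller-supplied placeholders."""

    storage_per_gb_month: float
    per_1000_requests: float
    compaction_per_1000_objects: float
    monitoring_per_1000_objects: float
    compaction_per_gb: float = 0.0  # assumption: optional per-GB compaction charge


def monthly_cost(
    rates: Rates,
    stored_gb: float,
    requests: int,
    compacted_objects: int,
    monitored_objects: int,
    compacted_gb: float = 0.0,
) -> float:
    """Sum the billing dimensions for one month of usage."""
    return (
        stored_gb * rates.storage_per_gb_month
        + requests / 1000 * rates.per_1000_requests
        + compacted_objects / 1000 * rates.compaction_per_1000_objects
        + monitored_objects / 1000 * rates.monitoring_per_1000_objects
        + compacted_gb * rates.compaction_per_gb
    )


# Made-up round numbers for illustration only; these are NOT AWS prices.
example = Rates(0.10, 0.01, 0.02, 0.05)
total = monthly_cost(example, stored_gb=1000, requests=2_000_000,
                     compacted_objects=500_000, monitored_objects=1_000_000)
print(f"{total:.2f}")  # prints "180.00"
```

Comparing this figure with what you currently spend on self-run compaction jobs gives a first-order view of whether the managed maintenance charges pay for themselves.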
Common mistakes
Mistake 1: Putting general-purpose object data in a table bucket. Table buckets are optimized for Iceberg tables — use a regular S3 bucket for media, logs, or backups.
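Mistake 2: Leaving snapshot retention unreviewed. Self-managed Iceberg keeps every snapshot until you expire it, and although S3 Tables expires snapshots automatically, its default retention may not match your time-travel or storage-cost requirements. Set each table's snapshot expiration policy up front.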
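A snapshot expiration policy can be expressed as a table maintenance configuration. The sketch below builds the request for boto3's `put_table_maintenance_configuration` call. The field names (`icebergSnapshotManagement`, `minSnapshotsToKeep`, `maxSnapshotAgeHours`) are given as I understand the `s3tables` API and should be verified against the current SDK reference. The function name and the example values are hypothetical.

```python
from typing import Any


def snapshot_retention_request(
    bucket_arn: str,
    namespace: str,
    table: str,
    max_age_hours: int,
    min_snapshots: int = 1,
) -> dict[str, Any]:
    """Build kwargs that enable snapshot expiration for one table."""
    if max_age_hours < 1 or min_snapshots < 1:
        raise ValueError("max_age_hours and min_snapshots must both be at least 1")
    return {
        "tableBucketARN": bucket_arn,
        "namespace": namespace,
        "name": table,
        "type": "icebergSnapshotManagement",
        "value": {
            "status": "enabled",
            "settings": {
                "icebergSnapshotManagement": {
                    # always retain at least this many snapshots
                    "minSnapshotsToKeep": min_snapshots,
                    # snapshots older than this become eligible for expiration
                    "maxSnapshotAgeHours": max_age_hours,
                }
            },
        },
    }
```

With credentials in place, the request would be sent as `boto3.client("s3tables").put_table_maintenance_configuration(**request)`. Choose `max_age_hours` to cover the longest time travel window your consumers actually need.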
Mistake 3: Granting overly broad table-bucket permissions. Use IAM table-level grants (the new bucket type supports table-scoped permissions) rather than full-bucket access.
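A table-scoped grant might look like the identity policy generated below. This is a minimal sketch that assumes read-only access is wanted: the action list is one plausible read set from the `s3tables` IAM namespace rather than an authoritative minimum, and the account ID, bucket name, and table ID in the example ARN are placeholders.

```python
import json

# Assumed read-only action set; adjust to what your query engine actually calls.
READ_ACTIONS = [
    "s3tables:GetTable",
    "s3tables:GetTableData",
    "s3tables:GetTableMetadataLocation",
]


def read_only_table_policy(table_arn: str) -> dict:
    """Return an IAM policy document that allows reads on a single table."""
    if "/table/" not in table_arn:
        # A table-bucket ARN here would grant access to every table in the bucket
        raise ValueError("expected a table ARN, not a table-bucket ARN")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadSingleTable",
                "Effect": "Allow",
                "Action": READ_ACTIONS,
                "Resource": table_arn,
            }
        ],
    }


policy = read_only_table_policy(
    "arn:aws:s3tables:us-east-1:111122223333:bucket/analytics-demo/table/EXAMPLE-TABLE-ID"
)
print(json.dumps(policy, indent=2))
```

Rejecting bucket-level ARNs in the helper is a deliberate guard: it makes the overly broad grant described above fail fast instead of silently succeeding.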
Related AWS Services
- AWS Glue Data Catalog — Central metastore for cross-engine reads
- Amazon Athena — Serverless SQL queries on S3 Tables
- Amazon Redshift — Spectrum reads on S3 Tables
- Amazon EMR / EMR Serverless — Spark, Trino, Flink workloads on S3 Tables
- Amazon SageMaker Lakehouse — Unified Iceberg + warehouse access for ML