# Data governance operating model rollout checklist

Run in order for a **federated data governance** program on AWS. Assumes **SageMaker Catalog**
(built on Amazon DataZone) as the 2026 business catalog surface, **AWS Glue** for technical
metadata, and **Lake Formation** for enforcement.

> Reflects **July 2026**: SageMaker Catalog unifies data + ML asset governance; DataZone
> experience remains for existing DataZone-only tenants; cross-account sharing via Lake
> Formation v5 (Feb 2026).

## Stage 0 — Charter (required)

- [ ] Name executive sponsor and data governance council (monthly cadence)
- [ ] Define 3–5 priority domains (e.g., finance, product, marketing, supply chain)
- [ ] Publish RACI using [stewardship-raci.csv](./stewardship-raci.csv)
- [ ] Set subscription SLA (e.g., approve within 2 business days)

**Rollback trigger:** No named owners → stop; catalog without stewardship becomes shelfware.

## Stage 1 — Technical foundation

- [ ] Glue Data Catalog registered per account/region; crawler scope limited to owned databases
- [ ] Lake Formation data lake settings enabled; IAMAllowedPrincipals grants removed where possible (while they remain, plain IAM policy bypasses LF-Tag enforcement on those resources)
- [ ] LF-Tags taxonomy drafted (sensitivity, domain, cost-center)
- [ ] Macie classification jobs on S3 landing buckets (weekly)

**Rollback trigger:** LF-Tags applied without steward review → access tickets flood IT.
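An added audit for the Lake Formation items above (this is example code, not part of the checklist or of any AWS tooling): it checks the dict returned by `lakeformation.get_data_lake_settings()` for a missing data lake admin and for `IAM_ALLOWED_PRINCIPALS` in the default-permission lists, and produces the hardened copy you would hand back to `put_data_lake_settings()` after steward sign-off. The sample settings are constructed to mirror that response shape; nothing here calls AWS.

```python
"""Stage 1 audit of Lake Formation data lake settings (added example)."""
from __future__ import annotations

IAM_ALLOWED = "IAM_ALLOWED_PRINCIPALS"
DEFAULT_KEYS = ("CreateDatabaseDefaultPermissions", "CreateTableDefaultPermissions")

# Constructed sample mirroring get_data_lake_settings()["DataLakeSettings"]
# for an account that still carries the pre-Lake-Formation defaults.
SAMPLE_SETTINGS = {
    "DataLakeAdmins": [
        {"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/LakeAdmin"}
    ],
    "CreateDatabaseDefaultPermissions": [
        {"Principal": {"DataLakePrincipalIdentifier": IAM_ALLOWED}, "Permissions": ["ALL"]}
    ],
    "CreateTableDefaultPermissions": [
        {"Principal": {"DataLakePrincipalIdentifier": IAM_ALLOWED}, "Permissions": ["ALL"]}
    ],
    "AllowExternalDataFiltering": False,
}


def _principal(entry: dict) -> str | None:
    return entry.get("Principal", {}).get("DataLakePrincipalIdentifier")


def audit_data_lake_settings(settings: dict) -> list[str]:
    """Return findings; an empty list means the two Stage 1 items pass.

    'settings enabled'  -> at least one data lake administrator is set.
    'IAMAllowedPrincipals removed' -> no default grant to IAM_ALLOWED_PRINCIPALS,
    which would hand every new database/table back to plain IAM policy.
    Existing tables that already carry the grant need their own revoke.
    """
    findings = []
    if not settings.get("DataLakeAdmins"):
        findings.append("no data lake administrator configured")
    for key in DEFAULT_KEYS:
        for entry in settings.get(key, []):
            if _principal(entry) == IAM_ALLOWED:
                findings.append(f"{key} still grants {entry.get('Permissions')} to {IAM_ALLOWED}")
    return findings


def harden(settings: dict) -> dict:
    """Copy of settings with IAM_ALLOWED_PRINCIPALS stripped from the defaults."""
    fixed = dict(settings)
    for key in DEFAULT_KEYS:
        fixed[key] = [e for e in settings.get(key, []) if _principal(e) != IAM_ALLOWED]
    return fixed


if __name__ == "__main__":
    for finding in audit_data_lake_settings(SAMPLE_SETTINGS):
        print("FINDING:", finding)
    print("after hardening:", audit_data_lake_settings(harden(SAMPLE_SETTINGS)))
```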

## Stage 2 — Business catalog (SageMaker Catalog)

- [ ] Create SageMaker domain + Catalog project per business domain
- [ ] Import business glossary terms (owner + steward per term)
- [ ] Publish owned assets from Glue tables and SageMaker feature groups
- [ ] Enable subscription workflow with steward approval

**Rollback trigger:** >30% subscription requests denied without documented reason → fix glossary gaps.

## Stage 3 — Quality and lineage

- [ ] Glue Data Quality rules on bronze → silver pipelines (row count, null rate, schema)
- [ ] Lineage captured for Glue jobs and Athena views (OpenLineage or native Glue lineage)
- [ ] Drift alarms on schema changes in producer accounts

**Rollback trigger:** Quality failures block downstream dashboards → establish waiver process.

## Stage 4 — Operating metrics

- [ ] Track: time-to-approve subscription, % assets with owner tag, glossary coverage
- [ ] Monthly council review: denied requests, orphan tables, Macie high-severity findings
- [ ] Link chargeback tags to the Cost and Usage Report (CUR) for data platform cost per domain

## Related posts

- [Data governance operating model on AWS](/blog/aws-data-governance-operating-model-sagemaker-catalog-2026/)
- [Amazon DataZone enterprise governance](/blog/amazon-datazone-enterprise-governance/)
- [Cross-account data sharing with Lake Formation](/blog/aws-secure-cross-account-data-sharing-lake-formation-2026/)
- [SageMaker Unified Studio](/blog/amazon-sagemaker-unified-studio/)
