Infrastructure as Code
Terraform on AWS
Deploy AWS infrastructure reliably with Terraform — Stacks, ephemeral values, provider-defined functions, Test Framework, and OpenTofu state encryption for teams that need an OSI-licensed alternative.
Last updated: April 29, 2026 | Author: FactualMinds Cloud Integration Team | Reviewed by: FactualMinds AWS-certified architects (Solutions Architect – Professional)
## Terraform on AWS in 2026
Terraform is still the default IaC tool on AWS for enterprise teams. What changed over the last two years is (a) Terraform Stacks going GA on HCP Terraform, (b) ephemeral values and ephemeral resources finally removing plaintext secrets from state files, (c) provider-defined functions cleaning up a lot of HCL gymnastics, (d) a production-ready Test Framework, and (e) OpenTofu maturing as a credible OSI-licensed alternative with client-side state encryption.
This page is a working guide to the 2026 configuration we ship.
> **Licensing in one paragraph**: Terraform moved from MPL 2.0 to the Business Source License (BUSL 1.1) in August 2023. **IBM completed its acquisition of HashiCorp in early 2025**; Terraform and HCP Terraform continue under the HashiCorp brand inside IBM Software. **OpenTofu** is the Linux Foundation / CNCF fork on MPL 2.0 — functionally compatible with Terraform for most AWS workflows, increasingly diverging on advanced features (Stacks, Sentinel). Pick based on license policy, HCP usage, and roadmap needs.
## What's new for Terraform on AWS in 2026
- **Terraform Stacks (GA on HCP Terraform, 2025)** — coordinated multi-configuration deployments across accounts and regions.
- **Ephemeral values and ephemeral resources (1.10+), write-only arguments (1.11+)** — never land in state.
- **Provider-defined functions (1.8+)** — cleaner HCL ergonomics.
- **Test Framework maturity (1.6 → 1.10+)** — production-ready module testing.
- **S3-native state locking (1.10+)** — DynamoDB lock table now optional.
- **Provider-defined validators** and richer CLI diagnostics.
- **HCP Terraform** — Stacks, Sentinel, OIDC to AWS, private registry, audit.
- **OpenTofu 1.8+** — client-side state encryption (available since OpenTofu 1.7), early provider innovations, community governance.
- **AWS provider v6** — continued coverage for new AWS services (Bedrock, S3 Tables, EKS Auto Mode, VPC Lattice, Verified Access).
## Why Terraform on AWS
- **Reproducibility** — the same module deploys identical infra in every account and region.
- **Governance** — Sentinel / OPA / HCP Terraform policy enforcement on every plan.
- **Scalability** — modules and Stacks let one platform team serve dozens of product teams.
- **Drift detection** — nightly plans catch console drift before auditors do.
- **Ecosystem** — richest provider ecosystem of any IaC tool; community modules cover most AWS services.
## Account and state architecture (landing-zone first)
- **AWS Organizations + Control Tower** owns account creation and baseline guardrails.
- **IAM Identity Center** replaces long-lived IAM users; Terraform runs via OIDC with short-lived credentials.
- **One AWS account per environment × workload** (or per business unit × environment).
- **One state key per (account, workload)** in an S3 backend inside that account.
- **S3 state bucket**: versioning on, KMS CMK encryption, Block Public Access, TLS-only bucket policy, Object Lock where compliance mandates it.
- **Locking**: S3-native (Terraform 1.10+) or DynamoDB lock table.
- **Cross-workspace references**: Terraform Stacks, not shared state files.
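The state-bucket controls listed above can be sketched in HCL roughly as follows. This is a minimal illustration rather than a complete landing-zone module: the resource names and the KMS key are assumptions, the bucket name is reused from the backend example on this page, and Object Lock is left out because it depends on your compliance requirements.

```hcl
# Hypothetical KMS key used only to encrypt Terraform state.
resource "aws_kms_key" "tfstate" {
  description         = "Encrypts Terraform state objects"
  enable_key_rotation = true
}

resource "aws_s3_bucket" "tfstate" {
  bucket = "acme-prod-tfstate"
}

# Versioning is what makes state recovery possible later.
resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.tfstate.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "tfstate" {
  bucket                  = aws_s3_bucket.tfstate.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# TLS-only: deny every request that does not arrive over HTTPS.
data "aws_iam_policy_document" "tls_only" {
  statement {
    sid       = "DenyInsecureTransport"
    effect    = "Deny"
    actions   = ["s3:*"]
    resources = [aws_s3_bucket.tfstate.arn, "${aws_s3_bucket.tfstate.arn}/*"]

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = ["false"]
    }
  }
}

resource "aws_s3_bucket_policy" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id
  policy = data.aws_iam_policy_document.tls_only.json
}
```

This bucket is usually created once by a small bootstrap configuration, since the backend cannot store its own state before the bucket exists.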
## Core Terraform concepts (refresher)
- **Providers** — `hashicorp/aws` for AWS APIs; pin with `~> 6.0` (or whatever current major) and upgrade deliberately.
- **Resources / data sources** — managed vs read-only views of infra.
- **Ephemeral values and resources** — new in 1.10+, never persisted.
- **Modules** — small, focused, versioned via the registry or a private registry on HCP Terraform.
- **Stacks** — HCP Terraform, coordinate multiple configs.
- **State** — remote, encrypted, locked, never in Git.
## Example: remote state with S3-native locking
```hcl
terraform {
  required_version = ">= 1.10"

  backend "s3" {
    bucket       = "acme-prod-tfstate"
    key          = "platform/eks/terraform.tfstate"
    region       = "eu-west-1"
    encrypt      = true
    kms_key_id   = "arn:aws:kms:eu-west-1:111122223333:key/…"
    use_lockfile = true
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0"
    }
  }
}
```
## Example: ephemeral value for a bootstrap password
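```hcl
# Requires Terraform 1.11+ and provider releases that support
# ephemeral resources and write-only arguments.
ephemeral "random_password" "bootstrap" {
  length  = 32
  special = true
}

resource "aws_secretsmanager_secret_version" "initial" {
  secret_id = aws_secretsmanager_secret.db.id

  # Ephemeral values may only flow into write-only arguments; assigning one
  # to the regular secret_string argument is rejected by Terraform.
  secret_string_wo         = ephemeral.random_password.bootstrap.result
  secret_string_wo_version = 1 # increment to push a newly generated password
}
```
The generated password exists only during the apply and reaches Secrets Manager through a write-only argument, so it is stored in neither the state nor the plan. Because the value is never persisted, Terraform cannot detect changes to it; bump `secret_string_wo_version` when you want a new password written.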
## Example: provider-defined function
```hcl
locals {
  # Use an ARN that carries an account ID; S3 bucket ARNs leave that field empty.
  arn_parts  = provider::aws::arn_parse(aws_iam_role.this.arn)
  account_id = local.arn_parts.account_id
}
```
No more regexes over ARNs.
## Terraform Stacks (HCP Terraform)
- Define components (configurations) and deployments (targets).
- Reference components from each other with typed outputs.
- Deploy an entire environment — VPC → EKS → workloads → observability — in one orchestrated apply.
- Roll out changes across environments with deployment strategies (percentage, wave, approval gates).
- Sentinel policies gate every step.
Use when you have three or more coupled configurations deploying together. For single-state workloads, plain workspaces are still fine.
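As a rough sketch of what "components referencing each other with typed outputs" looks like, the fragment below wires a hypothetical VPC module into a hypothetical EKS module. The module paths, the `region` variable, and the `vpc_id` output are assumptions, provider authentication is omitted, and the Stacks language has changed between releases, so check the current HCP Terraform reference before copying it.

```hcl
# Stack component configuration (sketch; names and paths are hypothetical).
required_providers {
  aws = {
    source  = "hashicorp/aws"
    version = "~> 6.0"
  }
}

variable "region" {
  type = string
}

# In Stacks, provider blocks take a type and a name, with settings under `config`.
provider "aws" "this" {
  config {
    region = var.region
    # Authentication (for example OIDC role assumption) is omitted here.
  }
}

component "vpc" {
  source = "./modules/vpc"

  inputs = {
    region = var.region
  }

  providers = {
    aws = provider.aws.this
  }
}

component "eks" {
  source = "./modules/eks"

  inputs = {
    # Consumes an output of the vpc component instead of reading shared state.
    vpc_id = component.vpc.vpc_id
  }

  providers = {
    aws = provider.aws.this
  }
}
```

The reference `component.vpc.vpc_id` is what replaces `terraform_remote_state` lookups between separately applied configurations.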
## Test Framework
```hcl
# tests/s3.tftest.hcl
run "s3_bucket_is_private" {
  command = plan

  assert {
    condition     = aws_s3_bucket_public_access_block.this.block_public_acls == true
    error_message = "S3 bucket must block public ACLs"
  }

  assert {
    condition     = aws_s3_bucket_server_side_encryption_configuration.this.rule[0].apply_server_side_encryption_by_default.sse_algorithm == "aws:kms"
    error_message = "S3 bucket must be encrypted with KMS"
  }
}
```
Run in CI on every PR. Pair with `tflint`, `tfsec` / `checkov`, and Sentinel (HCP) or OPA.
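For PR pipelines that should not hold AWS credentials, one option is to mock the provider (available from Terraform 1.7). The sketch below assumes the module under test declares the same `aws_s3_bucket_public_access_block.this` resource as the example above; the file name is illustrative.

```hcl
# tests/s3_mocked.tftest.hcl
# The mocked provider returns generated values instead of calling AWS,
# so this test runs without credentials.
mock_provider "aws" {}

run "public_access_is_blocked_without_credentials" {
  command = plan

  assert {
    condition     = aws_s3_bucket_public_access_block.this.block_public_acls == true
    error_message = "S3 bucket must block public ACLs"
  }
}
```

Mocked runs verify only what the configuration sets explicitly. Keep at least one test suite that runs against a real sandbox account to catch provider and API behavior.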
## OpenTofu 1.8+
- **Client-side state encryption** — encrypt state at rest with keys OpenTofu never sends to its backend.
- **Early provider work** and its own registry.
- **Governance**: Linux Foundation / CNCF.
- **Compatibility**: tracks Terraform core features where licensing and design allow; Stacks and Sentinel are Terraform-only.
Adopt OpenTofu when OSI licensing is non-negotiable or you want client-side state encryption today.
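A minimal sketch of OpenTofu's client-side encryption block, assuming an AWS KMS key is used as the key provider. The key alias and the block labels are hypothetical, and this block is OpenTofu-specific, so Terraform will not accept it.

```hcl
terraform {
  encryption {
    # Key provider: derives data keys from an AWS KMS key (alias is hypothetical).
    key_provider "aws_kms" "state" {
      kms_key_id = "alias/tofu-state"
      region     = "eu-west-1"
      key_spec   = "AES_256"
    }

    # Encryption method that uses the keys from the provider above.
    method "aes_gcm" "state" {
      keys = key_provider.aws_kms.state
    }

    # Encrypt both state and saved plan files before they leave the client.
    state {
      method = method.aes_gcm.state
    }

    plan {
      method = method.aes_gcm.state
    }
  }
}
```

This sits on top of S3 server-side encryption: the backend only ever receives ciphertext, so bucket read access alone is not enough to read the state.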
## Terraform vs AWS CDK vs CloudFormation (quick matrix)
| Dimension | Terraform | AWS CDK | CloudFormation / SAM | Pulumi |
| --------------- | -------------------------------------- | ---------------------------------------- | ------------------------- | ---------------------------------------- |
| Language | HCL | TS / Python / Java / .NET / Go | YAML / JSON | TS / Python / Go / .NET |
| Scope | Multi-cloud | AWS-only (CloudFormation under the hood) | AWS-only | Multi-cloud |
| State | External (S3 / HCP) | CloudFormation | CloudFormation | External |
| Policy engine | Sentinel / OPA | Custom Aspects | CloudFormation Guard | CrossGuard / OPA |
| Drift detection | `terraform plan`                       | CloudFormation Drift                     | CloudFormation Drift      | `pulumi preview --refresh`               |
| Ecosystem | Largest | Growing | Native AWS coverage | Growing |
| Good fit | Enterprise multi-cloud, HCP governance | AWS-only dev teams wanting TypeScript | Minimal-tooling AWS shops | Teams wanting CDK-style with multi-cloud |
## Failure modes & resilience
**1. AWS provider rate limits / throttling.** `hashicorp/aws` calls the AWS API like any other client; large `apply` runs hit `Throttling`, `RequestLimitExceeded`, or `TooManyRequestsException` on Organizations, IAM, and Lambda APIs. Configure exponential backoff via `max_retries` and split monolithic state files:
```hcl
provider "aws" {
  region      = var.aws_region
  max_retries = 25 # 25 is the provider default; raise only for known burst-heavy workloads

  default_tags {
    tags = local.common_tags
  }
}
```
For Organizations / Control Tower management, throttle parallelism: `terraform apply -parallelism=5` (default 10). Apply runs against AWS Organizations APIs are particularly sensitive.
**2. Partial-apply rollback.** Terraform does not roll back on partial failure: resources created before the error stay in place, and state reflects what succeeded. Recovery is forward: read the error, fix the cause, re-run `apply`. For the high-stakes case where forward recovery is hard to tolerate, apply only a reviewed saved plan (`terraform plan -out=<file>`), recreate a half-configured resource with `terraform apply -replace=<address>` (the replacement for the deprecated `terraform taint`), and add `prevent_destroy` lifecycle blocks on critical infra (RDS, KMS keys, S3 buckets with Object Lock).
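A minimal sketch of the `prevent_destroy` guard on a critical resource; the key shown here is hypothetical.

```hcl
resource "aws_kms_key" "customer_data" {
  description         = "Encrypts customer data at rest"
  enable_key_rotation = true

  lifecycle {
    # Any plan that would destroy this key fails instead of proceeding.
    prevent_destroy = true
  }
}
```

The guard only applies while the resource block is still in the configuration. Deleting the block removes the protection along with it, so pair this with plan review.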
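**3. State corruption recovery.** S3 versioning is your safety net. Recovery flow: `aws s3api list-object-versions --bucket <state-bucket> --prefix <key>` → identify last-good version → `aws s3api copy-object` with `--copy-source <state-bucket>/<key>?versionId=<version-id>` → re-acquire lock → `terraform plan -refresh-only` to confirm state matches reality (`terraform refresh` is deprecated). Automatic KMS key rotation does NOT impact existing state reads (KMS retains old material for decrypt). Pointing the backend at a different key requires `terraform init -reconfigure`, and the existing state object stays encrypted under the old key until the next state write, so keep decrypt access to the old key until then.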
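**4. Drift between console and state.** Inevitable in regulated multi-team accounts. Run nightly `terraform plan -detailed-exitcode` in CI: exit code 0 means no changes, 1 means the plan itself failed, and 2 means changes are pending, which on an unchanged configuration indicates drift. For known-safe resources created outside Terraform, declare `import` blocks and generate configuration with `terraform plan -generate-config-out=<file>` (1.5+); ticket the rest.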
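Config generation works from `import` blocks. A minimal sketch, assuming a bucket that was created in the console (the resource address and bucket name are hypothetical):

```hcl
# imports.tf
# Tells Terraform to adopt an existing bucket under this address.
import {
  to = aws_s3_bucket.audit_logs
  id = "acme-audit-logs"
}
```

With this in place, `terraform plan -generate-config-out=generated.tf` writes a matching `resource` block to `generated.tf` for review. Treat the generated HCL as a draft and tidy it before merging.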
**5. Provider major-version upgrades.** `hashicorp/aws` v5→v6 changed default behavior on several resources (`aws_s3_bucket_versioning`, `aws_lb` defaults). Always test in a staging account. Note that `~> 6.0` allows any 6.x minor or patch release; use a three-part constraint such as `~> 6.0.0` to allow patch upgrades only.
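The pessimistic constraint operator is easy to misread, so here is a short illustration of what it permits. The `6.2.0` version in the comment is an arbitrary example, not a recommendation.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"

      # "~> 6.0"   allows >= 6.0.0, < 7.0.0 (any minor or patch release of v6)
      # "~> 6.2.0" allows >= 6.2.0, < 6.3.0 (patch releases only)
      version = "~> 6.0"
    }
  }
}
```

Whichever constraint you choose, commit `.terraform.lock.hcl` so CI and laptops resolve the same provider build, and upgrade deliberately with `terraform init -upgrade`.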
**6. Cross-region operations.** `aws_route53_record`, ACM certs in `us-east-1` for CloudFront, and `aws_s3_bucket_replication_configuration` need provider aliases. Forgetting an alias silently creates resources in the default region.
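A minimal sketch of the alias pattern for a CloudFront certificate, which must live in `us-east-1`. The domain name and the default region are assumptions.

```hcl
# Default provider: where most of the workload lives.
provider "aws" {
  region = "eu-west-1"
}

# Aliased provider for resources that must be created in us-east-1.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

resource "aws_acm_certificate" "cdn" {
  # Without this line the certificate is created in eu-west-1,
  # and CloudFront cannot use it.
  provider = aws.us_east_1

  domain_name       = "cdn.example.com"
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}
```

When the resource lives inside a child module, pass the alias through the module's `providers` argument; the module cannot see aliases it was not given.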
## Multi-region rollout with Stacks
```hcl
# stack.tfdeploy.hcl
deployment "us_east_1" {
  inputs = { region = "us-east-1", weight = 100 }
}

deployment "eu_west_1" {
  inputs     = { region = "eu-west-1", weight = 100 }
  depends_on = [deployment.us_east_1]
}

orchestrate "rolling" {
  check {
    condition = context.plan.changes.add < 50
    reason    = "Plan changed too many resources; manual review required."
  }
}
```
Stacks deployments run sequentially with explicit `depends_on`; orchestration checks gate progression. For wave-based rollouts across many regions, group deployments into waves with manual approval gates between them.
## Observability runbook
**CI signals:**
| Signal | Action |
| ------------------------------------------------ | ------------------------------------------------------------------------------------ |
| Nightly `terraform plan` exit code 2 (drift) | Auto-create ticket; review before next deploy; `terraform import` if benign |
| `apply` duration `> 30 min` | Split state; check for `aws_route53_*` deletes (slow), ASG drains, RDS modifications |
| `Throttling` / `RateExceeded` in plan/apply logs | Reduce `-parallelism`, bump `max_retries`, split state, request limit increase |
| HCP Terraform run failure rate `> 5%` | Investigate Sentinel policy regressions, provider version skew |
| State lock held `> 1 hr` | `terraform force-unlock <lock-id>` only after confirming no active apply in CI logs |
**Drift alarm wiring (CloudWatch Logs → Metric Filter → Alarm):**
```bash
aws logs put-metric-filter \
  --log-group-name "/aws/codebuild/terraform-drift" \
  --filter-name terraform-drift-detected \
  --filter-pattern '"Plan: " "to add" "to change" "to destroy"' \
  --metric-transformations metricName=DriftCount,metricNamespace=Terraform,metricValue=1
```
Then alarm on `Terraform/DriftCount > 0` and route to PagerDuty / Slack.
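The alarm itself can be managed in Terraform. A minimal sketch, assuming an SNS topic that is already subscribed to PagerDuty or Slack; the topic and alarm names are hypothetical.

```hcl
# Hypothetical topic; subscriptions to PagerDuty or Slack are managed elsewhere.
resource "aws_sns_topic" "platform_alerts" {
  name = "platform-alerts"
}

resource "aws_cloudwatch_metric_alarm" "terraform_drift" {
  alarm_name        = "terraform-drift-detected"
  alarm_description = "Nightly terraform plan reported pending changes"

  # Must match the namespace and metric name used by the metric filter.
  namespace   = "Terraform"
  metric_name = "DriftCount"
  statistic   = "Sum"
  period      = 300

  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0

  # Nights without drift emit no data points; do not treat that as an alarm.
  treat_missing_data = "notBreaching"

  alarm_actions = [aws_sns_topic.platform_alerts.arn]
}
```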
## Common pitfalls (field-tested)
- **Manual console changes alongside Terraform** — run nightly `terraform plan` as drift detection; use `terraform import` to reconcile.
- **Committing state or `*.tfvars`** — `.gitignore` them; store secrets in Secrets Manager, pull via ephemeral values.
- **Monolithic configurations** — break into modules; giant single-state plans are slow and risky.
- **Skipping plan review** — require PR-level plan review; gate apply behind approval.
- **Unpinned providers** — pin `hashicorp/aws` with a major-version constraint; upgrade deliberately.
- **Long-lived IAM access keys in CI** — use OIDC federation instead.
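For the last pitfall, OIDC federation comes down to an identity provider plus a role whose trust policy is scoped to your pipeline. The sketch below assumes GitHub Actions as the CI system and a hypothetical `acme/infrastructure` repository; other CI systems use a different issuer URL and claim names.

```hcl
# Registers GitHub's OIDC issuer with IAM (one per account).
resource "aws_iam_openid_connect_provider" "github" {
  url            = "https://token.actions.githubusercontent.com"
  client_id_list = ["sts.amazonaws.com"]
}

data "aws_iam_policy_document" "ci_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Restrict to one repository and branch (hypothetical values).
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:acme/infrastructure:ref:refs/heads/main"]
    }
  }
}

resource "aws_iam_role" "terraform_ci" {
  name                 = "terraform-ci"
  assume_role_policy   = data.aws_iam_policy_document.ci_trust.json
  max_session_duration = 3600 # short-lived credentials, one hour
}
```

Attach only the permissions the pipeline needs to this role. A common split is a read-only role for `plan` on pull requests and a separate, more tightly trusted role for `apply`.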
## When Terraform is NOT the best fit
- Pure AWS-only team already deep in CDK with CDK Pipelines and application-language colocation.
- Very small team, one environment, one workload — CloudFormation/SAM templates are lighter.
- Workloads that are 95% managed AWS services with no Kubernetes, no external DNS, no SaaS providers — CDK or CloudFormation may be enough.
- Strict air-gapped environment with no provider registry access — evaluate Terraform Enterprise / OpenTofu + mirrored registry.
## Related reading
- [Terraform AWS provider upgrade strategy](/blog/terraform-aws-provider-upgrade-strategy/)
- [Terraform state management on AWS: import, move, repair](/blog/terraform-state-management-aws-import-move-repair/)
- [Safe Terraform apply workflows with approval gates on AWS](/blog/safe-terraform-apply-workflows-approval-gates-aws/)
- [Migrating from Terraform to OpenTofu on AWS](/blog/migrate-terraform-opentofu-aws/)
- [Terraform vs AWS CDK: IaC decision guide](/blog/terraform-vs-aws-cdk-infrastructure-as-code-decision-guide/)
- [AWS infrastructure drift detection with Terraform](/blog/aws-infrastructure-drift-detection-terraform/)
## Related services
- [AWS Architecture Review](/services/aws-architecture-review/)
- [DevOps Pipeline Setup](/services/devops-pipeline-setup/)
- [AWS Application Modernization](/services/aws-application-modernization/)