How to Design Multi-Region AWS Architectures Without Doubling Costs

Cloud Architecture · Palaniappan P · 12 min read

Quick summary: Multi-region AWS architectures can easily cost 2–3× a single-region equivalent when data replication, cross-region transfer, and duplicated managed services are not accounted for. Here is how to architect for resilience without proportional cost growth.


The decision to go multi-region is driven by one of two requirements: regulatory data residency (your contract says user data must stay in the EU) or resilience (you cannot afford a regional AWS outage to take you down). Both are legitimate. The cost mistake is treating multi-region as an all-or-nothing binary.

A full active-active multi-region architecture running identical stacks in us-east-1 and eu-west-1 costs nearly twice as much as a single-region deployment, plus cross-region replication charges. But most teams do not need active-active. They need something more targeted: their static assets globally cached, their database readable from multiple regions, and their compute able to start in a secondary region within 15 minutes of a declared incident.

This post walks through the cost model for each multi-region pattern, gives exact numbers for Aurora Global Database and S3 cross-region replication, and shows how to architect for meaningful resilience at a fraction of active-active cost.

The Active-Active vs Active-Passive Cost Gap

Let us start with concrete numbers. A representative mid-size application stack in a single region:

Component | Configuration | Monthly Cost
ECS Fargate | 4 tasks, 1 vCPU / 2 GB | $157
Aurora MySQL | db.r6g.large, Multi-AZ | $390
ElastiCache | cache.r6g.large | $183
ALB | ~1M requests/day | $55
NAT Gateway | ~100 GB/month | $49
Single Region Total | | ~$834/month
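As a quick sanity check, the single-region total is just the sum of the line items; a trivial Python sketch:

```python
# Monthly line items from the single-region table above.
single_region = {
    "ECS Fargate": 157,
    "Aurora MySQL": 390,
    "ElastiCache": 183,
    "ALB": 55,
    "NAT Gateway": 49,
}

print(sum(single_region.values()))  # 834
```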

Active-Active: Full Duplication

Active-active requires full production capacity in each region. You serve traffic from both regions simultaneously, and either region can handle full load if the other fails.

  • Second region compute (same as primary): +$834/month
  • Aurora Global Database replication overhead: +$85–100/month (storage, write I/O, data transfer — see section below)
  • S3 CRR for user uploads: depends on data volume (see section below)
  • Route 53 latency routing: minimal query charges
  • Active-active total: ~$1,750–1,770/month (2.1× single region)

The hidden active-active cost: your engineering team must design every write operation to handle cross-region conflict resolution or route all writes to a primary region (eliminating the active-active benefit for write-heavy workloads).

Active-Passive: Warm Standby

Active-passive runs full capacity in the primary region and a scaled-down warm standby in the secondary. Traffic only flows to the secondary when the primary fails. Failover is not instant — it requires scaling up compute, DNS propagation (60–300 seconds), and potentially promoting the Aurora Global Database secondary.

Secondary region in active-passive:

  • ECS Fargate minimum (1 task per service for warm standby): +$39/month
  • Aurora Global Database secondary: storage + write I/O replication (same as active-active): +$85–100/month
  • No ALB in standby (create on failover): $0
  • No NAT if secondary VPC is minimal: $0
  • Active-passive total: ~$960–980/month (1.15× single region)

The $1,750 vs $980 difference buys near-instantaneous failover (active-active) versus 3–10 minutes of recovery time (active-passive). For most applications, a 3–10 minute RTO is acceptable, and the roughly $770/month saving is better spent elsewhere.
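Those totals fall out of simple arithmetic on the single-region baseline. A rough sketch, treating the $92 Aurora Global Database overhead as an assumed midpoint of the $85–100/month range:

```python
SINGLE_REGION = 834        # monthly total from the cost table above
AURORA_OVERHEAD = 92       # assumed midpoint of the $85-100/month range
WARM_STANDBY_COMPUTE = 39  # one Fargate task per service in the secondary

# Active-active: a full duplicate stack plus database replication overhead.
active_active = SINGLE_REGION * 2 + AURORA_OVERHEAD

# Active-passive: primary stack, minimal standby compute, same DB replication.
active_passive = SINGLE_REGION + WARM_STANDBY_COMPUTE + AURORA_OVERHEAD

print(active_active)   # 1760
print(active_passive)  # 965
print(round(active_active / SINGLE_REGION, 2))  # 2.11
```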

The Multi-Region Data Transfer Trap

Data transfer costs are where multi-region architectures surprise teams. AWS charges for data that crosses region boundaries in several ways that are easy to miss at planning time.

Aurora Global Database: Full Cost Breakdown

Aurora Global Database replicates your primary cluster to up to 5 secondary regions with sub-second replication lag. The cost model has three components beyond the secondary cluster instance cost:

Storage replication: Aurora charges $0.20/GB/month for storage in the primary region. In each secondary region, the same storage cost applies for replicated data. A 100 GB database: $20/month for primary storage, $20/month for secondary storage. This doubles your Aurora storage cost regardless of instance size.

Write I/O replication: Aurora charges $0.20 per million write I/O operations. With Global Database, write I/Os in the primary region are replicated to secondary regions and charged again. 10 million write I/Os/day in the primary: $60/month at the primary, then $60/month again for the replicated write I/Os at the secondary. This doubles your write I/O cost.

Cross-region data transfer: $0.02/GB for data transferred between regions for Aurora Global Database replication. For a write-heavy application generating 5 GB of change data per day: 5 × 30 × $0.02 = $3/month. For 50 GB/day: $30/month. This is usually the smallest of the three components.

The total Aurora Global Database overhead for a 100 GB database with 10M write I/Os/day: approximately $85–100/month beyond the secondary cluster instance cost. Know this number before you commit.
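The three components fold into a quick estimator. This is a sketch using the per-unit rates quoted above; actual rates vary by region and change over time:

```python
def aurora_global_overhead(db_gb, write_ios_per_day, change_gb_per_day):
    """Monthly Aurora Global Database overhead beyond the secondary
    instance cost, using the per-unit rates quoted in the text."""
    storage = db_gb * 0.20                                # replicated storage, secondary region
    write_io = write_ios_per_day * 30 / 1_000_000 * 0.20  # replicated write I/Os
    transfer = change_gb_per_day * 30 * 0.02              # cross-region change stream
    return storage + write_io + transfer

# 100 GB database, 10M write I/Os/day, 5 GB of change data/day:
print(round(aurora_global_overhead(100, 10_000_000, 5), 2))  # 83.0
```

This lands at the low end of the quoted $85–100/month range; heavier change volumes push it toward the top.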

S3 Cross-Region Replication Cost Control

S3 Cross-Region Replication (CRR) charges $0.015/1,000 objects for replication PUT requests, plus inter-region data transfer at $0.02/GB. (CRR traffic rides the AWS backbone, so the $0.09/GB internet egress rate does not apply.) For a user-upload bucket replicating 1 TB/month: roughly $20/month in transfer plus the replication PUT request charges.
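A back-of-the-envelope estimator using these rates (the 500k object count is an assumed figure for illustration):

```python
def crr_monthly_cost(gb_replicated, objects_replicated):
    """Monthly S3 CRR cost from the rates in the text:
    $0.02/GB inter-region transfer + $0.015 per 1,000 replication PUTs."""
    transfer = gb_replicated * 0.02
    put_requests = objects_replicated / 1000 * 0.015
    return transfer + put_requests

# 1 TB/month of uploads, assumed ~500k objects:
print(round(crr_monthly_cost(1000, 500_000), 2))  # 27.5
```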

The prefix filter is the lever for controlling CRR costs. Not everything in S3 needs to be replicated cross-region. User profile images need replication (served globally). Temporary processing files do not. Raw video uploads before transcoding do not — only the transcoded output needs replication.

DynamoDB Global Tables: The Most Expensive Replication

DynamoDB Global Tables replicate every write to every configured region. The cost model: you pay standard DynamoDB pricing in each region, and every replicated write is billed again in the destination region at replicated-write rates. The practical effect is that each added region costs roughly another full copy of your write bill. A table whose writes cost $650/month in the primary region incurs roughly another $650/month in replicated writes for the second region, and another $650/month for each region after that.
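The linear write-cost scaling can be sketched as follows, treating the $650/month primary write bill as a given and ignoring the small rate difference between standard and replicated write units:

```python
def global_tables_write_cost(primary_write_bill, regions):
    """Every write is replicated to, and billed in, each configured region,
    so the total write bill grows roughly linearly with region count."""
    return primary_write_bill * regions

print(global_tables_write_cost(650, 1))  # 650  (single region)
print(global_tables_write_cost(650, 2))  # 1300 (a second region doubles it)
```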

DynamoDB Global Tables is appropriate for globally distributed write-heavy applications. For read-heavy applications with occasional writes, Aurora Global Database (which only replicates from primary to secondary, not the reverse) is substantially cheaper.

Route 53 Routing Strategies and Costs

Route 53 routing itself is cheap: query charges are negligible at any realistic scale. Choose a routing strategy based on requirements, not cost, and resist the urge to over-engineer it.

Failover Routing with Health Checks (Terraform)

resource "aws_route53_health_check" "primary" {
  fqdn              = "app.primary.example.com"
  port              = 443
  type              = "HTTPS"
  resource_path     = "/health"
  failure_threshold = 3
  request_interval  = 30

  tags = {
    Name = "primary-region-health-check"
  }
}

resource "aws_route53_health_check" "secondary" {
  fqdn              = "app.secondary.example.com"
  port              = 443
  type              = "HTTPS"
  resource_path     = "/health"
  failure_threshold = 3
  request_interval  = 30

  tags = {
    Name = "secondary-region-health-check"
  }
}

resource "aws_route53_record" "primary" {
  zone_id = var.hosted_zone_id
  name    = "app.example.com"
  type    = "A"

  failover_routing_policy {
    type = "PRIMARY"
  }

  set_identifier  = "primary"
  health_check_id = aws_route53_health_check.primary.id

  alias {
    name                   = var.primary_alb_dns_name
    zone_id                = var.primary_alb_zone_id
    evaluate_target_health = true
  }
}

resource "aws_route53_record" "secondary" {
  zone_id = var.hosted_zone_id
  name    = "app.example.com"
  type    = "A"

  failover_routing_policy {
    type = "SECONDARY"
  }

  set_identifier  = "secondary"
  health_check_id = aws_route53_health_check.secondary.id

  alias {
    name                   = var.secondary_alb_dns_name
    zone_id                = var.secondary_alb_zone_id
    evaluate_target_health = true
  }
}

With this configuration, Route 53 health checkers in multiple AWS locations request /health every 30 seconds. With failure_threshold = 3, three consecutive failed observations mark the endpoint unhealthy and trigger failover. At a 30-second interval and a threshold of 3, the maximum time to detect the failure and begin routing to the secondary is roughly 90 seconds, plus DNS TTL propagation.

Set your DNS TTL to 60 seconds for records that participate in failover routing. The default 300 seconds means 5 minutes of continued routing to a failed endpoint after Route 53 detects failure.
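The arithmetic behind these numbers, as a simplified model that ignores resolver quirks and Route 53 internal propagation:

```python
def worst_case_failover_seconds(check_interval, failure_threshold, dns_ttl):
    """Worst-case seconds from endpoint failure until clients reach the
    secondary: detection window plus cached-DNS expiry."""
    return check_interval * failure_threshold + dns_ttl

print(worst_case_failover_seconds(30, 3, 60))   # 150 with a 60s TTL
print(worst_case_failover_seconds(30, 3, 300))  # 390 with the default 300s TTL
```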

Latency-Based Routing vs Geolocation

Latency-based routing sends users to the lowest-latency region automatically, requires no configuration per user location, and adds no cost beyond query charges. Geolocation routing is appropriate for regulatory compliance (EU users must go to eu-west-1) but requires maintaining routing rules per country or continent.

For most applications, latency-based routing is the correct choice for multi-region active-active setups. It routes users to their closest region automatically as you add regions.

Aurora Global Database with Secondary Region (Terraform)

resource "aws_rds_global_cluster" "main" {
  global_cluster_identifier = var.cluster_name
  engine                    = "aurora-mysql"
  engine_version            = "8.0.mysql_aurora.3.04.0"
  database_name             = var.database_name
  storage_encrypted         = true
}

# Primary cluster (us-east-1)
resource "aws_rds_cluster" "primary" {
  provider = aws.primary

  cluster_identifier        = "${var.cluster_name}-primary"
  engine                    = aws_rds_global_cluster.main.engine
  engine_version            = aws_rds_global_cluster.main.engine_version
  global_cluster_identifier = aws_rds_global_cluster.main.id
  database_name             = var.database_name
  master_username           = var.master_username
  master_password           = var.master_password

  db_subnet_group_name   = aws_db_subnet_group.primary.name
  vpc_security_group_ids = [aws_security_group.aurora_primary.id]

  backup_retention_period   = 7
  skip_final_snapshot       = false
  # Required when skip_final_snapshot = false, otherwise destroy fails
  final_snapshot_identifier = "${var.cluster_name}-primary-final"

  tags = var.tags
}

resource "aws_rds_cluster_instance" "primary" {
  provider = aws.primary

  count              = 2
  identifier         = "${var.cluster_name}-primary-${count.index}"
  cluster_identifier = aws_rds_cluster.primary.id
  instance_class     = var.primary_instance_class
  engine             = aws_rds_cluster.primary.engine
  engine_version     = aws_rds_cluster.primary.engine_version

  tags = var.tags
}

# Secondary cluster (eu-west-1) — warm standby
resource "aws_rds_cluster" "secondary" {
  provider = aws.secondary

  cluster_identifier        = "${var.cluster_name}-secondary"
  engine                    = aws_rds_global_cluster.main.engine
  engine_version            = aws_rds_global_cluster.main.engine_version
  global_cluster_identifier = aws_rds_global_cluster.main.id

  db_subnet_group_name   = aws_db_subnet_group.secondary.name
  vpc_security_group_ids = [aws_security_group.aurora_secondary.id]

  # Secondary clusters cannot have master credentials — they replicate from primary
  skip_final_snapshot       = false
  # Required when skip_final_snapshot = false, otherwise destroy fails
  final_snapshot_identifier = "${var.cluster_name}-secondary-final"

  depends_on = [aws_rds_cluster_instance.primary]

  tags = var.tags
}

# Smaller instance for warm standby — scale up during failover
resource "aws_rds_cluster_instance" "secondary" {
  provider = aws.secondary

  count              = 1  # One instance for warm standby vs 2 in primary
  identifier         = "${var.cluster_name}-secondary-${count.index}"
  cluster_identifier = aws_rds_cluster.secondary.id
  instance_class     = var.secondary_instance_class  # Can be smaller than primary
  engine             = aws_rds_cluster.secondary.engine
  engine_version     = aws_rds_cluster.secondary.engine_version

  tags = var.tags
}

Running a db.r6g.medium in the secondary vs db.r6g.large in the primary saves approximately $150/month for the warm standby. The secondary can serve read traffic for local users, partially justifying its cost.

S3 Cross-Region Replication with Prefix Filter

Prefix filtering reduces CRR costs by replicating only objects that need geographic redundancy:

resource "aws_s3_bucket_replication_configuration" "user_assets" {
  role   = aws_iam_role.replication.arn
  bucket = aws_s3_bucket.primary.id

  rule {
    id     = "replicate-profile-images"
    status = "Enabled"

    filter {
      prefix = "profile-images/"
    }

    destination {
      bucket        = aws_s3_bucket.secondary.arn
      storage_class = "STANDARD_IA"  # Lower cost for secondary region

      replication_time {
        status = "Enabled"
        time {
          minutes = 15
        }
      }

      metrics {
        status = "Enabled"
        event_threshold {
          minutes = 15
        }
      }
    }

    delete_marker_replication {
      status = "Enabled"
    }
  }

  rule {
    id     = "replicate-documents"
    status = "Enabled"

    filter {
      prefix = "documents/"
    }

    destination {
      bucket        = aws_s3_bucket.secondary.arn
      storage_class = "STANDARD_IA"
    }

    delete_marker_replication {
      status = "Enabled"
    }
  }

  # Explicitly DO NOT replicate: temp/, processing/, logs/
  # Those prefixes are not included in any rule, so they are not replicated
}

Using STANDARD_IA in the secondary region saves ~40% on storage costs for replicated objects. The secondary copy is for disaster recovery, not active serving — access is infrequent by design, making STANDARD_IA appropriate. The $0.01/GB retrieval cost in STANDARD_IA is acceptable for DR scenarios.
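To put rough numbers on the saving, assuming us-east-1 list prices of $0.023/GB-month (Standard) and $0.0125/GB-month (Standard-IA); prices drift, so verify current rates:

```python
STANDARD = 0.023      # $/GB-month, S3 Standard (assumed us-east-1 list price)
STANDARD_IA = 0.0125  # $/GB-month, S3 Standard-IA

def ia_monthly_saving(gb):
    """Monthly storage saving from landing replicas in STANDARD_IA."""
    return gb * (STANDARD - STANDARD_IA)

print(round(ia_monthly_saving(1000), 2))     # 10.5 for 1 TB of replicated objects
print(round(1 - STANDARD_IA / STANDARD, 2))  # 0.46 fractional saving
```

At these list prices the raw storage saving is ~45%; retrieval charges on rare DR reads bring the effective figure closer to the ~40% quoted above.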

Partial Multi-Region: The 80% of Resilience at 20% of the Cost

Most teams do not need fully symmetric multi-region architecture. They need their application to survive a regional outage with acceptable downtime (15–60 minutes) at a fraction of active-active cost.

Pattern 1: Static Assets in Second Region Only

If your application serves a global audience, put your static assets (images, CSS, JavaScript, videos) in CloudFront with S3 origins in two regions. Use CloudFront Origin Groups with failover routing between primary and secondary S3 buckets. The compute and database stay in one region.

Cost: S3 storage in the secondary region for static assets (potentially $5–20/month), CloudFront distribution costs (already being paid), and S3 CRR transfer for assets only.

This pattern dramatically improves performance for global users and provides content availability even during compute region failures, for a fraction of full multi-region cost.

Pattern 2: Read Replicas for Database Resilience

Aurora supports cross-region read replicas outside of Global Database. A read replica in a secondary region provides:

  • Read traffic offloading for global users
  • A promotion path during disaster recovery (15–20 minute RTO to promote to primary)
  • Lower cost than Aurora Global Database (read replicas use standard replication, not Global Database replication pricing)

Cross-region read replica costs: $0.20/GB/month for replicated storage (same as Global Database), but write I/O replication is not charged separately for read replicas — the replication data is included in standard network transfer pricing ($0.02/GB). For write-light databases, read replicas are cheaper than Global Database while providing the same DR capability.
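A side-by-side sketch of the two overheads for a write-light database, using the rates quoted above:

```python
def replica_overhead(db_gb, change_gb_per_day):
    """Cross-region read replica: replicated storage plus network transfer.
    No separate write I/O replication charge (rates quoted in the text)."""
    return db_gb * 0.20 + change_gb_per_day * 30 * 0.02

def global_db_overhead(db_gb, write_ios_per_day, change_gb_per_day):
    """Aurora Global Database adds a replicated write I/O charge on top."""
    return (replica_overhead(db_gb, change_gb_per_day)
            + write_ios_per_day * 30 / 1_000_000 * 0.20)

# Write-light database: 100 GB, 1M write I/Os/day, 1 GB of change data/day
print(round(replica_overhead(100, 1), 2))               # 20.6
print(round(global_db_overhead(100, 1_000_000, 1), 2))  # 26.6
```

The gap is modest for write-light workloads and widens linearly with write I/O volume.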

Pattern 3: Lambda@Edge for Lightweight Global Logic

For API endpoints that need global low-latency, Lambda@Edge runs at CloudFront edge locations without a multi-region VPC/ECS setup. Functions run at the edge closest to the user and can make requests back to a single-region origin. Not appropriate for database-heavy operations, but ideal for auth token validation, A/B testing, request transformation, and caching logic.

Lambda@Edge pricing: $0.60/million requests + $0.00005001/GB-second. Far cheaper than running compute in multiple regions.
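A quick monthly estimator from those rates (the request volume, duration, and memory figures are assumed for illustration):

```python
def lambda_edge_monthly(requests, avg_duration_ms, memory_mb):
    """Monthly Lambda@Edge cost from the rates in the text:
    $0.60 per million requests + $0.00005001 per GB-second."""
    request_cost = requests / 1_000_000 * 0.60
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * 0.00005001

# Assumed workload: 10M requests/month, 50 ms average, 128 MB memory
print(round(lambda_edge_monthly(10_000_000, 50, 128), 2))  # 9.13
```

Under $10/month for 10M requests, versus hundreds of dollars for a second regional compute stack.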

Low-Cost Failover Testing with Route 53 Health Checks

Testing failover without triggering actual cross-region data transfer costs uses Route 53 health check manipulation:

  1. Update the primary health check to check a URL that temporarily returns HTTP 500 (add a maintenance flag to your health endpoint)
  2. Observe Route 53 detect the failure and failover to the secondary
  3. Validate secondary endpoint responds correctly
  4. Remove the maintenance flag
  5. Observe Route 53 detect recovery and restore primary routing

This tests the DNS failover path and health check detection without scaling up secondary compute or triggering cross-region data replication. It costs essentially nothing.

For full failover drills that validate the complete secondary stack (compute scale-up, Aurora promotion, data integrity): schedule quarterly, budget $200–500 for the scale-up period and cross-region traffic.

Edge Cases and Failure Patterns

Split-Brain in Active-Active

Split-brain arises when, during a network partition, each region believes it is the survivor and accepts writes. Aurora Global Database prevents true split-brain at the database level: only the primary accepts writes, and the secondary is read-only. Application-level split-brain (two instances of a background job running simultaneously) is a separate concern and requires distributed locking (via DynamoDB or ElastiCache) whose lock store is itself reachable from both regions.

Replication Lag and Stale Reads

Aurora Global Database typically achieves under 1 second replication lag. During high write throughput periods, lag can temporarily increase. If your application reads from the secondary immediately after a write, it may read stale data. Mitigations: route write-heavy sessions to the primary endpoint, use session consistency (always read from primary for the same session), or design the application to tolerate eventual consistency.

Health Check False Positives

A health check that tests only TCP connectivity will return healthy even when your application is returning 500 errors. Test the application health endpoint, not just the port. Include a lightweight database ping in your health endpoint (ensure the path to the database is healthy), but not a full integration test (health check latency should be under 200ms).

Consider a calculated health check in Route 53 that requires N of M individual health checks to be healthy before marking the record as healthy. This prevents a single-AZ failure (which AWS manages automatically via Multi-AZ) from triggering a cross-region failover.

Making the Decision

The question is not “should we go multi-region?” but “what components need multi-region coverage and what RTO/RPO does each require?”

A practical framework:

  • Static assets: CloudFront + S3 multi-region is always justified (improves performance, not just DR)
  • Database: Aurora Global Database with one secondary adds $85–200/month depending on data volume; justifiable for RPO < 5 minutes
  • Compute: Active-passive warm standby adds $40–100/month in a minimal secondary; justifiable for RTO < 30 minutes
  • Full active-active compute: justified only for RTO < 2 minutes, at roughly 1.5–2× total infrastructure cost

For more on AWS resilience patterns and DR strategies, see our guide on AWS disaster recovery strategies. For the cross-region data transfer costs that apply beyond replication, see AWS data transfer costs for startups. For a comprehensive cost governance framework, see our AWS cost control architecture optimization playbook.
