AWS Route 53: DNS and Traffic Management Patterns
Quick summary: A practical guide to AWS Route 53 — hosted zones, routing policies, health checks, DNS failover, domain registration, and the traffic management patterns that make applications highly available.
Key Takeaways
- A practical guide to AWS Route 53 — hosted zones, routing policies, health checks, DNS failover, domain registration, and the traffic management patterns that make applications highly available

Route 53 is AWS’s DNS service — it translates domain names into IP addresses that computers use to connect. But Route 53 is more than a DNS host. It provides health checking, traffic routing, failover, and geolocation-based routing that make it a critical component of high-availability architectures.
DNS is also one of the services where mistakes are most visible and most disruptive. A misconfigured DNS record can take down your entire application: not one server, not one Region, everything. This guide covers the Route 53 patterns that keep production applications available and performant.
Core Concepts
Hosted Zones
A hosted zone is a container for DNS records for a domain:
Public hosted zone — Resolves domain names on the internet:
example.com (public hosted zone)
├── A example.com → ALB (dualstack.my-alb.us-east-1.elb.amazonaws.com)
├── CNAME www.example.com → example.com
├── MX example.com → 10 mail.example.com
└── TXT example.com → "v=spf1 include:amazonses.com ~all"
Private hosted zone (resolves domain names within a VPC only):
internal.example.com (private hosted zone, attached to VPC)
├── A api.internal.example.com → 10.0.11.50
├── A db.internal.example.com → 10.0.21.30
└── CNAME cache.internal.example.com → my-redis.abc123.use1.cache.amazonaws.com
Private hosted zones enable human-readable names for internal services without exposing them to the internet. Attach the hosted zone to each VPC that needs to resolve these names.
Record Types
| Type | Purpose | Example |
|---|---|---|
| A | Maps domain to IPv4 address | example.com → 192.0.2.1 |
| AAAA | Maps domain to IPv6 address | example.com → 2001:db8::1 |
| CNAME | Maps domain to another domain | www.example.com → example.com |
| Alias | Maps domain to AWS resource (Route 53 specific) | example.com → ALB DNS name |
| MX | Mail server routing | example.com → 10 mail.example.com |
| TXT | Text records (SPF, DKIM, verification) | "v=spf1 include:amazonses.com ~all" |
| NS | Name server delegation | example.com → ns-123.awsdns-45.com |
| SRV | Service location | _sip._tcp.example.com → 10 60 5060 sip.example.com |
Alias Records
Alias records are one of Route 53’s most useful features. They map your domain directly to AWS resources without the limitations of a CNAME:
CNAME limitation: Cannot be used at the zone apex (example.com)
Alias advantage: Works at the zone apex, and queries that resolve to AWS resources are free
Use Alias records for:
- ALB/NLB/CLB endpoints
- CloudFront distributions
- S3 website endpoints
- API Gateway endpoints
- Another Route 53 record in the same hosted zone
Alias queries that resolve to AWS resources are free, while CNAME queries are billed at the standard rate. Prefer Alias whenever the target is an AWS resource.
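As a rough sketch, the change batch for an apex Alias record could be built like this in Python. The dictionary follows the request shape of Route 53's ChangeResourceRecordSets API. The helper name `alias_change_batch` and the zone ID value are placeholders assumed for illustration, so substitute the values for your own load balancer.

```python
import json


def alias_change_batch(record_name, target_dns_name, target_zone_id,
                       evaluate_target_health=True):
    """Build a ChangeBatch that upserts an Alias A record.

    target_zone_id is the hosted zone ID of the AWS resource (for example
    the load balancer's own zone ID), not the ID of your hosted zone.
    """
    return {
        "Comment": "Point the zone apex at an AWS resource via Alias",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                # Alias records carry no TTL and no ResourceRecords list.
                "AliasTarget": {
                    "HostedZoneId": target_zone_id,
                    "DNSName": target_dns_name,
                    "EvaluateTargetHealth": evaluate_target_health,
                },
            },
        }],
    }


batch = alias_change_batch(
    record_name="example.com.",
    target_dns_name="dualstack.my-alb.us-east-1.elb.amazonaws.com.",
    target_zone_id="ZEXAMPLEALBZONE",  # placeholder, not a real zone ID
)
print(json.dumps(batch, indent=2))
```

With boto3, a dictionary like this would be passed as the `ChangeBatch` argument of `change_resource_record_sets`, together with the `HostedZoneId` of your own zone.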
Routing Policies
Routing policies determine how Route 53 responds to DNS queries. Each policy serves a different traffic management pattern.
Simple Routing
One record, one or more values. Route 53 returns all values in random order:
example.com → [192.0.2.1, 192.0.2.2, 192.0.2.3]
Client receives all IPs, connects to one (typically the first)
Best for: Single resources or simple round-robin distribution. Simple records cannot have health checks attached, so if one IP is unhealthy, clients may still receive it.
Weighted Routing
Distribute traffic by percentage across multiple resources:
example.com:
Record 1: ALB-primary weight=90 (90% of traffic)
Record 2: ALB-canary weight=10 (10% of traffic)
Use cases:
- Canary deployments — Send 5-10% of traffic to the new version, monitor, then shift 100%
- A/B testing — Split traffic between application variants
- Blue-green migrations — Gradually shift traffic from old to new infrastructure
- Regional load distribution — Send traffic to Regions proportionally to capacity
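The percentages in the weighted example follow from a simple ratio: each record's share is its weight divided by the sum of all weights in the group. The sketch below illustrates that arithmetic and simulates a batch of answers. The function names are hypothetical and the simulation is only a local model, not a call to Route 53.

```python
import random


def traffic_shares(weights):
    """Return each record's expected share of responses (weight / sum of weights)."""
    total = sum(weights.values())
    if total == 0:
        # With every weight at 0, Route 53 answers with all records equally.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


def pick_record(weights, rng=random):
    """Simulate one DNS answer by sampling a record according to its share."""
    shares = traffic_shares(weights)
    names = list(shares)
    return rng.choices(names, weights=[shares[n] for n in names], k=1)[0]


weights = {"ALB-primary": 90, "ALB-canary": 10}
print(traffic_shares(weights))

rng = random.Random(42)  # fixed seed so the simulation is repeatable
sample = [pick_record(weights, rng) for _ in range(10_000)]
print("canary share in simulation:", sample.count("ALB-canary") / len(sample))
```

Because resolvers cache answers, the split seen by real clients only approximates these shares, especially at low query volumes.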
Latency-Based Routing
Route traffic to the Region with the lowest latency for the user:
example.com:
Record 1: ALB-us-east-1 (Region: us-east-1)
Record 2: ALB-eu-west-1 (Region: eu-west-1)
Record 3: ALB-ap-southeast-1 (Region: ap-southeast-1)
A user in London gets routed to eu-west-1. A user in Tokyo gets routed to ap-southeast-1. Route 53 measures latency between the user’s DNS resolver location and each AWS Region.
Best for: Multi-Region applications where user experience depends on latency. Combine with health checks for automatic failover when a Region is unhealthy.
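To make the selection rule concrete, here is a small local model of latency-based routing combined with health checks. The latency figures are invented for illustration and `pick_region` is a hypothetical helper. The fallback when no Region is healthy reflects my understanding of Route 53's behavior and is marked as an assumption in the code.

```python
def pick_region(latency_ms, healthy):
    """Choose the lowest-latency Region among those passing health checks."""
    candidates = [region for region in latency_ms if healthy.get(region, True)]
    if not candidates:
        # Assumption: when every record is unhealthy, all of them are
        # considered again so that some answer is still returned.
        candidates = list(latency_ms)
    return min(candidates, key=lambda region: latency_ms[region])


# Hypothetical latencies as seen from a resolver near London.
london = {"us-east-1": 80, "eu-west-1": 12, "ap-southeast-1": 170}

all_healthy = {region: True for region in london}
print(pick_region(london, all_healthy))

# eu-west-1 fails its health check, so the next-best Region is chosen.
eu_down = dict(all_healthy, **{"eu-west-1": False})
print(pick_region(london, eu_down))
```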
Geolocation Routing
Route traffic based on the geographic location of the user:
example.com:
Record 1: ALB-eu-west-1 (Location: Europe)
Record 2: ALB-us-east-1 (Location: North America)
Record 3: ALB-ap-northeast-1 (Location: Asia)
Record 4: ALB-us-east-1 (Location: Default, catches all others)
Use cases:
- Compliance — Keep EU user data in EU Regions (GDPR)
- Content localization — Serve language-specific content by geography
- License restrictions — Restrict service availability by country
- Regulatory requirements — Different processing rules per jurisdiction
Always include a default record. Without it, users outside your defined geographies receive no response.
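The role of the default record is easier to see in a small model of the lookup. The sketch below assumes the most specific match wins (country before continent before default). The `resolve_geo` helper and the record table are illustrative, not Route 53 code. The two-letter continent codes (EU, NA, AS, SA) are the ones Route 53 uses.

```python
DEFAULT = ("default", "*")


def resolve_geo(records, country=None, continent=None):
    """Return the target for the most specific matching location, if any."""
    for key in (("country", country), ("continent", continent), DEFAULT):
        if key in records:
            return records[key]
    return None  # no matching record and no default: the client gets no answer


records = {
    ("continent", "EU"): "ALB-eu-west-1",
    ("continent", "NA"): "ALB-us-east-1",
    ("continent", "AS"): "ALB-ap-northeast-1",
    DEFAULT: "ALB-us-east-1",
}

print(resolve_geo(records, country="DE", continent="EU"))
print(resolve_geo(records, country="BR", continent="SA"))

# Same lookup without the default record: South America is left unanswered.
no_default = {k: v for k, v in records.items() if k != DEFAULT}
print(resolve_geo(no_default, country="BR", continent="SA"))
```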
Failover Routing
Active-passive failover between primary and secondary resources:
example.com:
Primary: ALB-us-east-1 (health check: /health → 200 OK)
Secondary: ALB-us-west-2 (used only when primary fails health check)
Route 53 monitors the primary with health checks. When the primary fails, Route 53 automatically routes all traffic to the secondary. When the primary recovers, traffic routes back.
Best for: Disaster recovery with active-passive architecture. The secondary can be a full standby, a static S3 error page, or a different Region deployment.
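A failover pair is two records with the same name, one marked PRIMARY and one SECONDARY. The following sketch builds both changes in the shape expected by the ChangeResourceRecordSets API. The helper name, DNS names, zone IDs, and health check ID are placeholders assumed for the example.

```python
def failover_record(name, role, target_dns_name, target_zone_id,
                    health_check_id=None):
    """Build one UPSERT change for a failover Alias record."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{role.lower()}-{target_dns_name}",
        "Failover": role,
        "AliasTarget": {
            "HostedZoneId": target_zone_id,
            "DNSName": target_dns_name,
            "EvaluateTargetHealth": True,
        },
    }
    if health_check_id is not None:
        # Without a health signal on the primary, failover never triggers.
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


change_batch = {
    "Changes": [
        failover_record("example.com.", "PRIMARY",
                        "alb-primary.us-east-1.elb.amazonaws.com.",
                        "ZPLACEHOLDEREAST",          # placeholder zone ID
                        health_check_id="hc-primary-placeholder"),
        failover_record("example.com.", "SECONDARY",
                        "alb-standby.us-west-2.elb.amazonaws.com.",
                        "ZPLACEHOLDERWEST"),         # placeholder zone ID
    ]
}
print([c["ResourceRecordSet"]["Failover"] for c in change_batch["Changes"]])
```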
Multivalue Answer Routing
Return multiple healthy values with health checks:
example.com:
Record 1: 192.0.2.1 (health check: healthy) ← returned
Record 2: 192.0.2.2 (health check: healthy) ← returned
Record 3: 192.0.2.3 (health check: unhealthy) ← NOT returned
Record 4: 192.0.2.4 (health check: healthy) ← returned
Similar to simple routing but with health checks: unhealthy resources are removed from responses. Route 53 returns up to 8 healthy records per query.
Best for: Simple health-checked round-robin without the complexity of an ALB.
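As a minimal model of the behavior described above, the sketch below filters out unhealthy values and caps the answer at eight. `multivalue_answer` is a hypothetical helper. The fallback to unhealthy records when nothing is healthy is my understanding of the service's behavior and is labeled as an assumption.

```python
import random

MAX_ANSWERS = 8  # multivalue answer routing returns at most eight records


def multivalue_answer(records, rng=random):
    """Return up to eight healthy values in random order."""
    healthy = [ip for ip, is_healthy in records.items() if is_healthy]
    # Assumption: if nothing is healthy, unhealthy values are returned
    # rather than an empty answer.
    pool = healthy or list(records)
    return rng.sample(pool, k=min(MAX_ANSWERS, len(pool)))


records = {
    "192.0.2.1": True,
    "192.0.2.2": True,
    "192.0.2.3": False,  # failing its health check, so it is left out
    "192.0.2.4": True,
}
print(sorted(multivalue_answer(records)))
```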
Health Checks
Health checks are the mechanism that makes routing policies intelligent — without them, Route 53 routes to dead endpoints.
Health Check Types
| Type | Monitors | Best For |
|---|---|---|
| Endpoint | HTTP/HTTPS/TCP response | ALBs, APIs, web applications |
| Calculated | Combination of other health checks | Complex health logic (3 of 5 healthy) |
| CloudWatch alarm | CloudWatch metric alarm state | Custom health based on any metric |
Endpoint Health Checks
Health check configuration:
Protocol: HTTPS
Endpoint: example.com
Port: 443
Path: /health
Interval: 30 seconds (or 10 seconds for fast detection)
Failure threshold: 3 (mark unhealthy after 3 consecutive failures)
Regions: Route 53 checks from multiple Regions worldwide
Health check endpoint design:
- Return 200 OK only when the application is fully operational
- Check database connectivity, cache availability, and critical dependencies
- Respond quickly: for HTTP/HTTPS checks, Route 53 must establish the TCP connection within 4 seconds and receive a 2xx or 3xx status within 2 seconds of connecting
- Do not cache health check responses
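The endpoint design points above can be sketched as a framework-independent function that runs one probe per dependency and maps the result to a status code. The names `health_response` and `detection_seconds` and the probe callables are assumptions for illustration. A real service would wire this into its web framework and keep each probe well under the timeout.

```python
import json


def health_response(probes):
    """Run each dependency probe and build a (status, headers, body) tuple."""
    results = {}
    for name, probe in probes.items():
        try:
            results[name] = bool(probe())
        except Exception:
            # A probe that raises counts as a failed dependency.
            results[name] = False
    healthy = all(results.values())
    status = 200 if healthy else 503
    # no-store keeps intermediaries from serving a stale "healthy" answer.
    headers = {"Content-Type": "application/json", "Cache-Control": "no-store"}
    body = json.dumps({"healthy": healthy, "checks": results})
    return status, headers, body


def detection_seconds(interval_seconds, failure_threshold):
    """Approximate time until an endpoint is marked unhealthy."""
    return interval_seconds * failure_threshold


def failing_cache():
    raise ConnectionError("cache unreachable")


status, headers, body = health_response(
    {"database": lambda: True, "cache": failing_cache}
)
print(status, body)
print(detection_seconds(30, 3), detection_seconds(10, 3))
```

The second helper is only a back-of-the-envelope estimate: with a failure threshold of 3, the standard 30-second interval needs roughly 90 seconds to detect an outage, and the 10-second interval roughly 30 seconds, before any DNS caching is taken into account.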
Calculated Health Checks
Combine multiple health checks with AND/OR logic:
Application health = API health check AND Database health check AND Cache health check
→ All three must be healthy for the application to be considered healthy
→ If any one fails, failover routing activates
OR variant:
Application health = API-AZ-a OR API-AZ-b OR API-AZ-c
→ Application is healthy if at least 1 of 3 AZs is healthy
Health Check + Alarm Integration
Use CloudWatch alarms for health metrics that are not HTTP-based:
CloudWatch alarm: SQS queue depth > 10,000 (consumer falling behind)
→ Alarm state: ALARM
→ Health check: Unhealthy
→ Route 53: Failover to secondary Region
This enables failover based on any CloudWatch metric, not just HTTP response codes.
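Both calculated and alarm-backed checks reduce to simple rules, which the sketch below models locally. A calculated check is healthy when at least `health_threshold` children are healthy (the API field is `HealthThreshold`), so AND is a threshold equal to the number of children and OR is a threshold of 1. The alarm mapping mirrors the `InsufficientDataHealthStatus` setting. The function names and sample data are illustrative.

```python
def alarm_to_health(alarm_state, insufficient_data_status="LastKnownStatus",
                    last_known=True):
    """Map a CloudWatch alarm state to a healthy/unhealthy boolean."""
    if alarm_state == "OK":
        return True
    if alarm_state == "ALARM":
        return False
    if alarm_state != "INSUFFICIENT_DATA":
        raise ValueError(f"unknown alarm state: {alarm_state}")
    if insufficient_data_status == "Healthy":
        return True
    if insufficient_data_status == "Unhealthy":
        return False
    return last_known  # LastKnownStatus keeps the previous result


def calculated_health(children, health_threshold):
    """Healthy when at least health_threshold child checks are healthy."""
    healthy_children = sum(1 for ok in children.values() if ok)
    return healthy_children >= health_threshold


children = {
    "api": True,
    "database": True,
    # Queue-depth alarm is firing, so this child is unhealthy.
    "queue-depth-alarm": alarm_to_health("ALARM"),
}
print("AND:", calculated_health(children, health_threshold=len(children)))
print("OR:", calculated_health(children, health_threshold=1))
```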
Architecture Patterns
Multi-Region Active-Active
Serve traffic from multiple Regions simultaneously:
example.com (latency-based routing)
├── us-east-1: ALB → ECS Service (health check: /health)
├── eu-west-1: ALB → ECS Service (health check: /health)
└── ap-southeast-1: ALB → ECS Service (health check: /health)
Users are routed to the nearest healthy Region. If a Region fails its health check, traffic automatically redistributes to the remaining Regions.
Requirements:
- Stateless application tier (no local session state)
- Global database (DynamoDB Global Tables or Aurora Global Database)
- Cross-Region data replication
Multi-Region Active-Passive (DR)
Primary Region handles all traffic; secondary activates during failure:
example.com (failover routing)
├── Primary: us-east-1 ALB (health check: /health)
└── Secondary: us-west-2 ALB (activated on primary failure)
Cost advantage: The secondary Region can run at reduced capacity (pilot light or warm standby) until failover, significantly reducing DR costs compared to active-active.
Blue-Green Deployment
Use weighted routing for zero-downtime deployments:
Step 1: example.com → Blue (weight=100), Green (weight=0)
Step 2: Deploy new version to Green
Step 3: example.com → Blue (weight=90), Green (weight=10) ← canary
Step 4: Monitor Green for errors
Step 5: example.com → Blue (weight=0), Green (weight=100) ← cutover
If Green shows errors at Step 4, revert: set Green weight back to 0. Because resolvers cache answers, some clients keep following the old weights until the record’s TTL expires.
TTL for blue-green: Set record TTL to 60 seconds during deployments. Lower TTLs mean faster cutover but more DNS queries (and cost).
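The weight shifts in the steps above could be expressed as a sequence of change batches, one per step, plus a rollback batch. This is a sketch under simplifying assumptions: Blue and Green are plain A records with documentation-range IP addresses, and the helper names are hypothetical. With load balancers you would use Alias targets instead of `ResourceRecords`.

```python
def weighted_record(name, set_id, ip, weight, ttl=60):
    """Build one UPSERT change for a weighted A record."""
    if not 0 <= weight <= 255:
        raise ValueError("Route 53 weights must be between 0 and 255")
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,
            "Weight": weight,
            "TTL": ttl,
            "ResourceRecords": [{"Value": ip}],
        },
    }


def cutover_batches(name, blue_ip, green_ip, green_weights=(0, 10, 100), ttl=60):
    """Return one ChangeBatch per deployment step (initial, canary, cutover)."""
    batches = []
    for green in green_weights:
        batches.append({
            "Comment": f"blue={100 - green} green={green}",
            "Changes": [
                weighted_record(name, "blue", blue_ip, 100 - green, ttl),
                weighted_record(name, "green", green_ip, green, ttl),
            ],
        })
    return batches


def rollback_batch(name, blue_ip, green_ip, ttl=60):
    """Send all traffic back to Blue."""
    return cutover_batches(name, blue_ip, green_ip, green_weights=(0,), ttl=ttl)[0]


plan = cutover_batches("example.com.", "192.0.2.10", "192.0.2.20")
for batch in plan:
    print(batch["Comment"])
print("rollback:", rollback_batch("example.com.", "192.0.2.10", "192.0.2.20")["Comment"])
```

Applying each batch only after the monitoring window for the previous step has passed keeps the rollback path to a single change.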
CloudFront + Route 53
For global content delivery:
example.com (Alias → CloudFront distribution)
→ CloudFront edge locations worldwide
→ Origin: ALB in us-east-1
CloudFront handles geographic distribution at the CDN layer. Route 53 provides the DNS resolution. Combined with WAF for security and S3 for static assets, this is the standard architecture for global web applications.
Domain Management
Domain Registration
Route 53 is also a domain registrar. Registering a domain through Route 53 simplifies management because the hosted zone for the domain is created automatically.
Domain transfer to Route 53:
- Unlock the domain at the current registrar
- Get the authorization code (EPP code)
- Initiate transfer in Route 53 console
- Confirm transfer via email
- Transfer typically completes within 5-7 days
DNSSEC
DNSSEC (DNS Security Extensions) protects against DNS spoofing by digitally signing DNS records:
Route 53 DNSSEC:
1. Enable DNSSEC signing in the hosted zone
2. Provide a key-signing key (KSK) backed by a customer managed KMS key
3. Add the DS record to the parent zone (at the registrar)
4. Route 53 signs all records in the zone automatically
Enable DNSSEC for: Financial applications, healthcare, any domain where DNS spoofing could redirect users to malicious sites.
Cost Optimization
Pricing
| Component | Cost |
|---|---|
| Hosted zone | $0.50/month |
| Standard queries | $0.40/million |
| Latency-based queries | $0.60/million |
| Geo queries | $0.70/million |
| Alias queries (to AWS resources) | Free |
| Health checks (basic, AWS endpoint) | $0.50/month |
| Health checks (basic, non-AWS endpoint) | $0.75/month |
| Optional health check features (HTTPS, string matching, fast interval, latency measurement) | $1.00/month each for AWS endpoints |
Cost Reduction
- Use Alias records for AWS resources: Alias queries that resolve to AWS resources are free, CNAME queries are not
- Increase TTL for stable records: Higher TTL = fewer queries = lower cost. Production records: 300 seconds (5 min). Stable records: 3600 seconds (1 hour)
- Consolidate hosted zones: Subdomains of one domain (api.example.com, app.example.com) can live as records in the parent hosted zone instead of in separate zones that are each billed monthly
- Use basic health checks: Fast-interval (10 sec) health checks carry an extra monthly charge. Use the standard interval (30 sec) unless your failover SLA requires faster detection
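For a rough sense of scale, the arithmetic behind a monthly bill can be sketched with the per-unit prices from the pricing table. This is a simplified estimate: `monthly_cost` is a hypothetical helper, the workload numbers are invented, and volume tiers and optional health check features are ignored.

```python
PRICES = {
    "hosted_zone": 0.50,           # per zone per month
    "standard_per_million": 0.40,
    "latency_per_million": 0.60,
    "geo_per_million": 0.70,
    "alias_per_million": 0.00,     # alias queries to AWS resources
    "basic_health_check": 0.50,    # per check per month
}


def monthly_cost(hosted_zones=1, standard_millions=0, latency_millions=0,
                 geo_millions=0, alias_millions=0, basic_health_checks=0):
    """Estimate a monthly Route 53 bill in USD from usage counts."""
    return (
        hosted_zones * PRICES["hosted_zone"]
        + standard_millions * PRICES["standard_per_million"]
        + latency_millions * PRICES["latency_per_million"]
        + geo_millions * PRICES["geo_per_million"]
        + alias_millions * PRICES["alias_per_million"]
        + basic_health_checks * PRICES["basic_health_check"]
    )


# Hypothetical workload: 50M standard, 20M latency-based, 100M alias queries.
estimate = monthly_cost(hosted_zones=1, standard_millions=50,
                        latency_millions=20, alias_millions=100,
                        basic_health_checks=3)
print(f"${estimate:.2f} per month")

# The same 100M queries served through CNAMEs instead of Alias records.
with_cnames = monthly_cost(hosted_zones=1, standard_millions=150,
                           latency_millions=20, basic_health_checks=3)
print(f"${with_cnames:.2f} per month without Alias")
```

In this made-up workload the Alias records account for the entire difference between the two totals, which is why the first bullet above tends to matter most.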
Monitoring
Set CloudWatch alarms for DNS health:
| Metric | Alarm Condition | Indicates |
|---|---|---|
| HealthCheckStatus | 0 (unhealthy) | Endpoint failure, potential failover |
| HealthCheckPercentageHealthy | < 100% | Partial failure across health check Regions |
| DNSQueries | Sudden spike or drop | Traffic anomaly or DNS issue |
| ConnectionTime | > 4 seconds | Endpoint responding slowly |
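As an illustration of the first row, the parameters for a CloudWatch alarm on `HealthCheckStatus` might look like the sketch below. The keys match the arguments of CloudWatch's PutMetricAlarm API. The helper name, health check ID, and SNS topic ARN are placeholders assumed for the example.

```python
def health_check_alarm(health_check_id, sns_topic_arn):
    """Build PutMetricAlarm parameters that fire when a health check fails."""
    return {
        "AlarmName": f"route53-healthcheck-{health_check_id}-unhealthy",
        "Namespace": "AWS/Route53",
        "MetricName": "HealthCheckStatus",
        "Dimensions": [{"Name": "HealthCheckId", "Value": health_check_id}],
        # The metric is 1 when healthy and 0 when unhealthy, so alarm
        # when the minimum over the period drops below 1.
        "Statistic": "Minimum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


params = health_check_alarm(
    "hc-placeholder-id",                                 # placeholder ID
    "arn:aws:sns:us-east-1:123456789012:dns-alerts",     # placeholder ARN
)
print(params["AlarmName"])
```

With boto3 these would be passed as `put_metric_alarm(**params)` on a CloudWatch client created for us-east-1, where Route 53 health check metrics are published.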
Query Logging
Enable Route 53 query logging to CloudWatch Logs for:
- Traffic analysis — Understand DNS query patterns and volume
- Security monitoring — Detect unusual query patterns or domains
- Debugging — Verify which records are being returned to clients
- Compliance — Audit trail of DNS resolution
Common Mistakes
Mistake 1: No Health Checks on Failover Records
Failover routing without health checks does not fail over: Route 53 always returns the primary record because it has no way to know the primary is unhealthy. Attach a health check (or, for Alias records, enable Evaluate Target Health) to every primary failover record and every latency-based record.
Mistake 2: TTL Too High During Deployments
A TTL of 3600 seconds (1 hour) means resolvers can keep serving the old answer for up to 1 hour after a change. Before a deployment or migration, reduce the TTL to 60 seconds at least one full old-TTL period in advance (24 hours is a safe margin). After the change is stable, raise the TTL again.
Mistake 3: Missing Default Geolocation Record
Geolocation routing without a default record means users outside your defined geographies get no DNS response — effectively an outage for those users. Always include a default record.
Mistake 4: CNAME at Zone Apex
Using CNAME for the root domain (example.com) violates the DNS specification and breaks other records (MX, TXT). Use Alias records instead — they work at the zone apex and are free.
Mistake 5: No Monitoring on Health Checks
Health checks transition to unhealthy and trigger failover, but no one is notified. Set CloudWatch alarms on HealthCheckStatus for every health check so the team knows when failover occurs.
Getting Started
Route 53 is the entry point for all traffic to your application. Combined with CloudFront for content delivery, ALBs for load balancing, and multi-Region architectures for disaster recovery, it provides the DNS and traffic management layer that production applications require.
For DNS architecture design, multi-Region traffic management, and infrastructure review, talk to our team.



