AWS VPC Networking Best Practices for Production
Quick summary: A practical guide to AWS VPC networking — CIDR planning, subnet strategies, NAT gateways, VPC endpoints, Transit Gateway, and the network architecture patterns that scale with your organization.
Networking is the foundation that every other AWS service runs on. A well-designed VPC provides security isolation, predictable routing, and the flexibility to grow without re-architecting. A poorly designed VPC leads to overlapping IP ranges that prevent connectivity, public-facing resources that should be private, and networking costs that grow faster than the workloads they support.
Most networking mistakes are made during initial setup and are expensive to fix later. This guide covers the decisions that matter — CIDR planning, subnet strategy, connectivity patterns, and cost optimization — so you get the network right the first time.
VPC Design
CIDR Planning
CIDR (Classless Inter-Domain Routing) defines your VPC’s IP address range. Getting this right is critical because VPC CIDRs cannot overlap if you need to connect VPCs together, and expanding a VPC CIDR after creation has limitations.
Recommended CIDR ranges:
| Environment | CIDR | Total IPs | Rationale |
|---|---|---|---|
| Production | 10.0.0.0/16 | 65,536 | Room for growth, many subnets |
| Staging | 10.1.0.0/16 | 65,536 | Mirrors production for testing |
| Development | 10.2.0.0/16 | 65,536 | Developer workloads |
| Shared Services | 10.10.0.0/16 | 65,536 | CI/CD, DNS, shared tools |
Rules:
- Use /16 for production VPCs — smaller ranges limit future growth
- Never overlap CIDRs across VPCs that might need to communicate
- Reserve ranges for on-premises connectivity (avoid any part of 10.0.0.0/8 that your data center already uses)
- Document your CIDR allocation in a central registry
In a multi-account organization, plan CIDRs centrally to prevent overlaps across accounts.
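A central registry is easy to check automatically. The following is a minimal sketch using Python's standard `ipaddress` module; the registry contents mirror the table above, and the `CIDR_REGISTRY` and `find_overlaps` names are illustrative assumptions rather than part of any AWS tooling.

```python
import ipaddress
from itertools import combinations

# Hypothetical registry; entries mirror the recommended ranges above.
CIDR_REGISTRY = {
    "production": "10.0.0.0/16",
    "staging": "10.1.0.0/16",
    "development": "10.2.0.0/16",
    "shared-services": "10.10.0.0/16",
}


def find_overlaps(registry):
    """Return every pair of registry names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in registry.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


if __name__ == "__main__":
    for a, b in find_overlaps(CIDR_REGISTRY):
        print(f"overlap: {a} <-> {b}")
```

Running a check like this before allocating a new range (for example in CI against the registry file) would catch a proposed VPC or on-premises range that collides with an existing one.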
Subnet Strategy
Three-tier architecture:
VPC: 10.0.0.0/16
├── Public Subnets (internet-facing)
│   ├── 10.0.1.0/24 (AZ-a): ALB, NAT Gateway, bastion hosts
│   ├── 10.0.2.0/24 (AZ-b)
│   └── 10.0.3.0/24 (AZ-c)
├── Private Subnets (application tier)
│   ├── 10.0.11.0/24 (AZ-a): ECS tasks, Lambda, EC2 instances
│   ├── 10.0.12.0/24 (AZ-b)
│   └── 10.0.13.0/24 (AZ-c)
└── Data Subnets (database tier)
    ├── 10.0.21.0/24 (AZ-a): RDS, ElastiCache, OpenSearch
    ├── 10.0.22.0/24 (AZ-b)
    └── 10.0.23.0/24 (AZ-c)
Three Availability Zones: Always deploy across at least 2 AZs for high availability. Three AZs provide better fault tolerance and are required for some services (e.g., Amazon MSK, Aurora Multi-AZ with 2 readers).
Public subnets have a route to the Internet Gateway. Only resources that must receive inbound traffic from the internet should be here — ALBs, NAT Gateways, and (rarely) bastion hosts.
Private subnets have a route to a NAT Gateway (for outbound internet access) but no inbound internet route. Application workloads live here.
Data subnets have no internet access at all — no NAT Gateway route. Databases should never initiate outbound internet connections. AWS service access uses VPC endpoints.
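The three-tier layout can be derived from the VPC CIDR rather than typed by hand. This is a rough sketch with the standard `ipaddress` module; the tier offsets (1, 11, 21) follow the layout above, and the function names are illustrative. It also accounts for the five addresses AWS reserves in every subnet, which leaves 251 assignable addresses in a /24.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
AZS = ["a", "b", "c"]
# Third-octet starting offset for each tier, matching the layout above.
TIER_OFFSETS = {"public": 1, "private": 11, "data": 21}
# AWS reserves the first four addresses and the last address of each subnet.
AWS_RESERVED_PER_SUBNET = 5


def plan_subnets(vpc=VPC_CIDR, azs=AZS, tiers=TIER_OFFSETS):
    """Map (tier, az) to a /24 carved out of the VPC range."""
    all_24s = list(vpc.subnets(new_prefix=24))
    plan = {}
    for tier, offset in tiers.items():
        for i, az in enumerate(azs):
            plan[(tier, az)] = all_24s[offset + i]
    return plan


def usable_hosts(subnet):
    """Addresses left for workloads after the AWS-reserved ones."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


if __name__ == "__main__":
    for (tier, az), subnet in plan_subnets().items():
        print(f"{tier:8} AZ-{az}: {subnet} ({usable_hosts(subnet)} usable)")
```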
Security Groups vs NACLs
Security Groups — Stateful, instance-level firewall. The primary access control mechanism:
| Rule | Source | Port | Purpose |
|---|---|---|---|
| ALB → Application | ALB security group | 8080 | Application traffic |
| Application → Database | Application security group | 5432 | Database queries |
| Application → Redis | Application security group | 6379 | Cache access |
Best practice: Reference security groups by ID (not CIDR) whenever possible. sg-abc123 is self-documenting and adapts automatically when instances are added or removed.
NACLs (Network ACLs) — Stateless, subnet-level firewall. Use as a secondary defense layer:
- Default NACLs allow all traffic — do not rely on them for security
- Custom NACLs block known malicious IP ranges or restrict traffic between subnet tiers
- NACLs require explicit allow rules for both inbound and outbound (stateless)
Recommendation: Use security groups as the primary control. Add NACLs only for specific requirements (blocking IP ranges, enforcing subnet-level restrictions for compliance).
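The rules in the table above can be expressed as data that references source security groups instead of CIDR blocks. The sketch below builds dictionaries in the shape that boto3's `authorize_security_group_ingress` accepts for its `IpPermissions` parameter; the group IDs are placeholders and the helper names are assumptions made for illustration.

```python
# Placeholder IDs; real security group IDs look like sg-0123456789abcdef0.
ALB_SG = "sg-alb-example"
APP_SG = "sg-app-example"
DB_SG = "sg-db-example"
REDIS_SG = "sg-redis-example"


def sg_ingress_rule(source_group_id, port, description):
    """One TCP ingress permission whose source is another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "UserIdGroupPairs": [
            {"GroupId": source_group_id, "Description": description}
        ],
    }


# Ingress rules keyed by the security group they are attached to.
INGRESS_RULES = {
    APP_SG: [sg_ingress_rule(ALB_SG, 8080, "Application traffic")],
    DB_SG: [sg_ingress_rule(APP_SG, 5432, "Database queries")],
    REDIS_SG: [sg_ingress_rule(APP_SG, 6379, "Cache access")],
}


def uses_cidr_source(rule):
    """True if a rule falls back to CIDR ranges instead of group references."""
    return bool(rule.get("IpRanges") or rule.get("Ipv6Ranges"))
```

With boto3 available, each entry could be applied with something like `ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=rules)`. A check such as `uses_cidr_source` is one way to flag rules that drift away from group references.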
Internet Connectivity
NAT Gateways
NAT Gateways provide outbound internet access for private subnet resources (package updates, API calls, SaaS integrations):
Cost:
- $0.045/hour per NAT Gateway = $32.40/month
- $0.045/GB data processed
High availability: Deploy one NAT Gateway per AZ. If AZ-a’s NAT Gateway fails, resources in AZ-a lose internet access — but resources in AZ-b and AZ-c continue normally.
Cost optimization:
- A single NAT Gateway in one AZ works for development environments ($32/month vs $97/month for three)
- Use VPC endpoints for AWS service traffic to reduce NAT Gateway data processing charges
- S3 and DynamoDB Gateway endpoints are free — always deploy them
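The figures above follow from simple arithmetic. This small sketch reproduces them, assuming a 720-hour (30-day) month and the prices quoted in this section; actual prices vary by Region, so treat the constants as inputs to adjust.

```python
HOURS_PER_MONTH = 720      # 30-day month, the basis for the $32.40 figure
NAT_HOURLY_USD = 0.045     # per NAT Gateway, as quoted above
NAT_PER_GB_USD = 0.045     # data processing, as quoted above


def nat_monthly_cost(gateways, gb_processed=0):
    """Hourly charges for all gateways plus data processing, in USD."""
    fixed = gateways * NAT_HOURLY_USD * HOURS_PER_MONTH
    return round(fixed + gb_processed * NAT_PER_GB_USD, 2)


if __name__ == "__main__":
    print("dev, 1 gateway:  ", nat_monthly_cost(1))
    print("prod, 3 gateways:", nat_monthly_cost(3))
    print("prod + 1 TB:     ", nat_monthly_cost(3, 1000))
```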
VPC Endpoints
VPC endpoints provide private connectivity to AWS services without traversing the internet or NAT Gateway:
Gateway endpoints (free):
- S3
- DynamoDB
Always deploy these. They are free and reduce NAT Gateway costs.
Interface endpoints ($0.01/hour per AZ + $0.01/GB):
- ECR (for container image pulls)
- CloudWatch Logs (for log shipping)
- STS (for IAM role assumption)
- Secrets Manager / SSM Parameter Store
- KMS
- SQS, SNS, EventBridge
- Lambda, Step Functions
Cost analysis: An interface endpoint in 3 AZs costs ~$21.60/month in hourly charges. Traffic through the endpoint is billed at $0.01/GB instead of $0.045/GB through a NAT Gateway, a saving of $0.035/GB, so the endpoint pays for itself once your workload sends more than about 620 GB/month to that service. For high-traffic services (ECR, CloudWatch Logs), endpoints almost always save money.
Security benefit: VPC endpoints keep traffic within the AWS network. Data never traverses the public internet, reducing the attack surface.
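To work out where an interface endpoint starts paying for itself, compare its fixed hourly charge with the per-GB saving relative to NAT Gateway processing. The sketch below assumes the prices quoted in this section and a 720-hour month; note that the endpoint still bills $0.01/GB, so the saving per GB is $0.035 rather than the full $0.045.

```python
HOURS_PER_MONTH = 720
ENDPOINT_HOURLY_PER_AZ_USD = 0.01
ENDPOINT_PER_GB_USD = 0.01
NAT_PER_GB_USD = 0.045


def endpoint_fixed_cost(az_count):
    """Monthly hourly charges for one interface endpoint across AZs."""
    return az_count * ENDPOINT_HOURLY_PER_AZ_USD * HOURS_PER_MONTH


def break_even_gb(az_count):
    """Monthly GB to one service above which the endpoint is cheaper than NAT."""
    saving_per_gb = NAT_PER_GB_USD - ENDPOINT_PER_GB_USD
    return endpoint_fixed_cost(az_count) / saving_per_gb


if __name__ == "__main__":
    print(f"fixed cost, 3 AZs: ${endpoint_fixed_cost(3):.2f}")
    print(f"break-even, 3 AZs: {break_even_gb(3):.0f} GB/month")
```

Deploying the endpoint in fewer AZs lowers the break-even point proportionally, at the cost of cross-AZ traffic or reduced resilience for that service.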
Multi-VPC Connectivity
VPC Peering
Point-to-point connectivity between two VPCs:
VPC A ←→ VPC B (peering connection)
Advantages: Simple, no additional cost (data transfer charges only), low latency.
Limitations: Not transitive (VPC A ↔ B and VPC B ↔ C does not mean VPC A ↔ C). For more than 3-4 VPCs, peering creates an unmanageable mesh.
Best for: Connecting 2-3 VPCs in simple architectures.
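Because peering is not transitive, connecting every VPC to every other VPC requires a full mesh, and the number of connections grows quadratically. A short illustration (the function name is just for this example):

```python
def full_mesh_peerings(vpc_count):
    """Peering connections needed so every VPC can reach every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


if __name__ == "__main__":
    for n in (2, 3, 5, 10, 20):
        print(f"{n:2} VPCs -> {full_mesh_peerings(n):3} peering connections")
```

Each connection also needs route table entries on both sides, which is where the management burden comes from as the count rises.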
Transit Gateway
Hub-and-spoke connectivity for multiple VPCs and on-premises networks:
VPC A ───┐
VPC B ───┤
VPC C ───┼─── Transit Gateway ─── On-premises (VPN / Direct Connect)
VPC D ───┤
VPC E ───┘
Advantages:
- Centralized routing — one hub connects all VPCs
- Transitive routing — any VPC can reach any other VPC through the hub
- VPN and Direct Connect integration
- Route tables for segmentation (production VPCs cannot reach development VPCs)
- Cross-Region peering for multi-Region architectures
Cost: $0.05/hour per attachment + $0.02/GB data processed. A Transit Gateway with 5 VPC attachments costs ~$180/month before data processing charges.
Best for: Multi-account organizations with 4+ VPCs, hybrid connectivity, or network segmentation requirements.
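The Transit Gateway cost figure can be reproduced with a few lines, again assuming a 720-hour month and the prices quoted above (which vary by Region). The function name is illustrative.

```python
HOURS_PER_MONTH = 720
TGW_ATTACHMENT_HOURLY_USD = 0.05
TGW_PER_GB_USD = 0.02


def tgw_monthly_cost(attachments, gb_processed=0):
    """Attachment-hour charges plus data processing, in USD."""
    fixed = attachments * TGW_ATTACHMENT_HOURLY_USD * HOURS_PER_MONTH
    return round(fixed + gb_processed * TGW_PER_GB_USD, 2)


if __name__ == "__main__":
    print("5 attachments, no traffic:", tgw_monthly_cost(5))
    print("5 attachments, 1 TB:      ", tgw_monthly_cost(5, 1000))
```

Unlike peering, the fixed cost here scales linearly with the number of VPCs, so the comparison is mostly about whether the attachment fees are worth the simpler routing.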
PrivateLink
Expose a service from one VPC to another without VPC peering:
Consumer VPC → VPC Endpoint (Interface) → PrivateLink → NLB → Provider VPC
Best for: Sharing specific services (APIs, databases) across accounts without full network connectivity. The consumer only accesses the specific service endpoint, not the provider’s entire VPC.
Hybrid Connectivity
AWS VPN
Encrypted tunnels over the public internet:
| Option | Bandwidth | Latency | Cost |
|---|---|---|---|
| Site-to-Site VPN | Up to 1.25 Gbps per tunnel | Variable (internet) | $0.05/hour + data transfer |
| Client VPN | Per-connection | Variable | $0.10/hour + $0.05/connection-hour |
Best for: Quick connectivity setup, backup for Direct Connect, remote developer access.
AWS Direct Connect
Dedicated network connection from your data center to AWS:
| Option | Bandwidth | Latency | Cost |
|---|---|---|---|
| Dedicated (1 Gbps, 10 Gbps, 100 Gbps) | Dedicated | Consistent, low | Port fee + data transfer |
| Hosted (50 Mbps - 10 Gbps) | Shared | Consistent, low | Partner pricing + data transfer |
Best for: Production hybrid workloads requiring consistent latency and high throughput. Financial services, healthcare, and any workload with data residency requirements.
High availability: Deploy Direct Connect connections in two different Direct Connect locations. Use VPN as a backup for Direct Connect.
Network Cost Optimization
Data Transfer Costs
Data transfer is often the largest hidden cost in AWS networking:
| Transfer Type | Cost |
|---|---|
| Inbound (internet → AWS) | Free |
| Same AZ | Free |
| Cross-AZ (same Region) | $0.01/GB each direction |
| Cross-Region | $0.02/GB |
| Internet outbound | $0.09/GB (first 10 TB) |
| NAT Gateway processing | $0.045/GB |
| VPC endpoint processing | $0.01/GB |
Cost reduction strategies:
- Keep communicating services in the same AZ when possible (free vs $0.02/GB cross-AZ, counting the $0.01/GB charged in each direction)
- Use CloudFront for content delivery (cheaper outbound rates: $0.085/GB vs $0.09/GB, and cached content eliminates origin transfer)
- Deploy S3 and DynamoDB gateway endpoints (free, eliminates NAT Gateway charges)
- Use VPC endpoints for high-traffic AWS services
- Compress data in transit to reduce GB transferred
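A rough monthly estimate can be built directly from the rate table above. The sketch below is only as accurate as those rates: it uses the first-tier internet outbound price, ignores volume discounts, and expects cross-AZ traffic to be counted once per direction. The dictionary keys and function name are assumptions for this example.

```python
# USD per GB, copied from the table above.
RATES_USD_PER_GB = {
    "inbound": 0.0,
    "same_az": 0.0,
    "cross_az": 0.01,            # charged in each direction; count both
    "cross_region": 0.02,
    "internet_outbound": 0.09,   # first 10 TB tier only
    "nat_processing": 0.045,
    "endpoint_processing": 0.01,
}


def monthly_transfer_cost(gb_by_type):
    """Sum the cost of each transfer type; reject unknown types."""
    unknown = set(gb_by_type) - set(RATES_USD_PER_GB)
    if unknown:
        raise ValueError(f"unknown transfer types: {sorted(unknown)}")
    total = sum(RATES_USD_PER_GB[kind] * gb for kind, gb in gb_by_type.items())
    return round(total, 2)


if __name__ == "__main__":
    usage = {"cross_az": 500, "internet_outbound": 1000, "nat_processing": 200}
    print(f"estimated transfer cost: ${monthly_transfer_cost(usage):.2f}")
```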
NAT Gateway Cost Reduction
NAT Gateways charge $0.045/GB for data processing. For workloads making heavy use of AWS services:
- Deploy S3 and DynamoDB gateway endpoints — Free, eliminates NAT charges for the two highest-volume services
- Deploy interface endpoints for ECR — Container image pulls from ECR through NAT are expensive; endpoint is cheaper for most workloads
- Deploy CloudWatch Logs endpoint — Log shipping volume can be significant
- Consolidate internet access — If multiple VPCs need internet access, route through a centralized NAT in a shared VPC via Transit Gateway
Monitoring
VPC Flow Logs
Enable VPC Flow Logs to capture network traffic metadata:
- Accepted traffic — Useful for understanding communication patterns and traffic volume
- Rejected traffic — Security monitoring (port scans, unauthorized access attempts)
- All traffic — Complete visibility (highest cost)
Send flow logs to S3 for long-term analysis or CloudWatch Logs for real-time alerting. Use Athena to query flow logs in S3 for network forensics.
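For quick investigations outside Athena, flow log records are easy to parse by hand. This sketch assumes the default (version 2) record format, which is space-separated with 14 fields, and counts rejected connections per source address. The sample records use made-up account, interface, and address values.

```python
from collections import Counter

# Field order of the default (version 2) flow log format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line):
    """Split one default-format record into a dict; None if malformed."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def rejected_by_source(lines):
    """Count REJECT records per source address."""
    counts = Counter()
    for line in lines:
        record = parse_record(line)
        if record and record["action"] == "REJECT":
            counts[record["srcaddr"]] += 1
    return counts


# Illustrative records only; IDs and addresses are invented.
SAMPLE = [
    "2 123456789012 eni-0a1b2c3d 203.0.113.12 10.0.11.25 49152 22 6 1 40 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0a1b2c3d 203.0.113.12 10.0.11.25 49153 3389 6 1 40 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.10 10.0.11.25 41234 8080 6 10 5200 1700000000 1700000060 ACCEPT OK",
]

if __name__ == "__main__":
    for source, count in rejected_by_source(SAMPLE).most_common():
        print(source, count)
```

Flow logs configured with a custom format have a different field order, so `FIELDS` would need to match that configuration.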
Network Monitoring
Set CloudWatch alarms for:
- NAT Gateway ErrorPortAllocation: NAT Gateway running out of ports (scale or split traffic)
- NAT Gateway BytesOutToDestination: Unexpected data transfer volume
- Transit Gateway BytesIn/BytesOut: Traffic volume anomalies
- VPN TunnelState: VPN tunnel down
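As one example, an alarm on NAT Gateway port allocation errors could be defined as below. The function only builds the keyword arguments for CloudWatch's `put_metric_alarm` call, so it runs without AWS credentials; the threshold, period, alarm name, and the NAT Gateway ID and SNS topic ARN in the usage note are illustrative assumptions to tune for your environment.

```python
def nat_port_alarm(nat_gateway_id, sns_topic_arn):
    """Keyword arguments for an alarm on any port allocation error."""
    return {
        "AlarmName": f"nat-port-allocation-errors-{nat_gateway_id}",
        "Namespace": "AWS/NATGateway",
        "MetricName": "ErrorPortAllocation",
        "Dimensions": [{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        "Statistic": "Sum",
        "Period": 300,                 # seconds; assumed evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 0,                # alarm on the first failed allocation
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 installed and credentials configured, this could be applied with `boto3.client("cloudwatch").put_metric_alarm(**nat_port_alarm("nat-0123456789abcdef0", "arn:aws:sns:us-east-1:123456789012:network-alerts"))`, where both arguments are placeholders.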
Common Mistakes
Mistake 1: Insufficient CIDR Planning
Starting with a /24 VPC (256 IPs) and discovering you need more IPs after deploying 50 services. While you can add secondary CIDRs, the expanded range may overlap with other VPCs. Plan for growth with /16 VPCs from the start.
Mistake 2: Everything in Public Subnets
Placing application servers and databases in public subnets because “it is easier to access.” Every resource that does not need inbound internet traffic should be in a private subnet. Use ALBs for inbound traffic and NAT Gateways for outbound.
Mistake 3: No VPC Endpoints
Routing all AWS service traffic through NAT Gateways when gateway endpoints (S3, DynamoDB) are free. Deploy gateway endpoints in every VPC — they cost nothing and save significant NAT Gateway data processing fees.
Mistake 4: Single-AZ Deployment
Deploying all resources in a single Availability Zone. When that AZ has an incident (hardware failure, network issue), your entire application goes down. Always deploy across at least 2 AZs for production workloads.
Getting Started
VPC networking is the foundation that determines the security, connectivity, and cost characteristics of everything you build on AWS. Getting the network right during initial setup prevents expensive re-architecture later.
For network architecture design as part of your AWS architecture review, multi-account networking with Transit Gateway, or hybrid connectivity planning, talk to our team.



