AWS VPC Networking Best Practices for Production

Cloud Architecture 8 min read

Quick summary: A practical guide to AWS VPC networking — CIDR planning, subnet strategies, NAT gateways, VPC endpoints, Transit Gateway, and the network architecture patterns that scale with your organization.

Networking is the foundation that every other AWS service runs on. A well-designed VPC provides security isolation, predictable routing, and the flexibility to grow without re-architecting. A poorly designed VPC leads to overlapping IP ranges that prevent connectivity, public-facing resources that should be private, and networking costs that grow faster than the workloads they support.

Most networking mistakes are made during initial setup and are expensive to fix later. This guide covers the decisions that matter — CIDR planning, subnet strategy, connectivity patterns, and cost optimization — so you get the network right the first time.

VPC Design

CIDR Planning

CIDR (Classless Inter-Domain Routing) notation defines your VPC’s IP address range. Getting this right is critical because VPCs with overlapping CIDRs cannot be connected to each other, and a VPC’s primary CIDR cannot be changed after creation (you can only add secondary CIDR blocks, within limits).

Recommended CIDR ranges:

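| Environment | CIDR | Total IPs | Rationale |
|---|---|---|---|
| Production | 10.0.0.0/16 | 65,536 | Room for growth, many subnets |
| Staging | 10.1.0.0/16 | 65,536 | Mirrors production for testing |
| Development | 10.2.0.0/16 | 65,536 | Developer workloads |
| Shared Services | 10.10.0.0/16 | 65,536 | CI/CD, DNS, shared tools |

Note: 65,536 is the size of the block. AWS reserves 5 addresses in every subnet, so the usable count is slightly lower.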

Rules:

  • Use /16 for production VPCs — smaller ranges limit future growth
  • Never overlap CIDRs across VPCs that might need to communicate
  • Reserve ranges for on-premises connectivity (if your data center already uses parts of 10.0.0.0/8, allocate VPC ranges that do not collide with them)
  • Document your CIDR allocation in a central registry

In a multi-account organization, plan CIDRs centrally to prevent overlaps across accounts.
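
A central registry can be checked for overlaps mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the `REGISTRY` dictionary and the `analytics` entry are hypothetical names for illustration, with the CIDRs taken from the recommended ranges above.

```python
import ipaddress
from itertools import combinations

# Hypothetical registry: names are illustrative, CIDRs mirror the recommended ranges.
REGISTRY = {
    "production": "10.0.0.0/16",
    "staging": "10.1.0.0/16",
    "development": "10.2.0.0/16",
    "shared-services": "10.10.0.0/16",
}


def find_overlaps(registry):
    """Return (name, name) pairs, in sorted name order, whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in registry.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


print(find_overlaps(REGISTRY))  # the recommended plan has no overlaps

# A proposed new VPC that falls inside the staging range is flagged.
proposed = dict(REGISTRY, analytics="10.1.128.0/17")
print(find_overlaps(proposed))
```

Running a check like this before any new VPC is created (for example in CI against the registry file) is one way to enforce the "never overlap" rule across accounts.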

Subnet Strategy

Three-tier architecture:

VPC: 10.0.0.0/16
├── Public Subnets (internet-facing)
│   ├── 10.0.1.0/24 (AZ-a) — ALB, NAT Gateway, bastion hosts
│   ├── 10.0.2.0/24 (AZ-b)
│   └── 10.0.3.0/24 (AZ-c)
├── Private Subnets (application tier)
│   ├── 10.0.11.0/24 (AZ-a) — ECS tasks, Lambda, EC2 instances
│   ├── 10.0.12.0/24 (AZ-b)
│   └── 10.0.13.0/24 (AZ-c)
└── Data Subnets (database tier)
    ├── 10.0.21.0/24 (AZ-a) — RDS, ElastiCache, OpenSearch
    ├── 10.0.22.0/24 (AZ-b)
    └── 10.0.23.0/24 (AZ-c)

Three Availability Zones: Always deploy across at least 2 AZs for high availability. Three AZs provide better fault tolerance and are needed to place one node per AZ in three-node layouts (e.g., Amazon MSK with three brokers, Aurora with a writer and 2 readers in separate AZs).

Public subnets have a route to the Internet Gateway. Only resources that must receive inbound traffic from the internet should be here — ALBs, NAT Gateways, and (rarely) bastion hosts.

Private subnets have a route to a NAT Gateway (for outbound internet access) but no inbound internet route. Application workloads live here.

Data subnets have no internet access at all — no NAT Gateway route. Databases should never initiate outbound internet connections. AWS service access uses VPC endpoints.
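
The three-tier layout above can be generated rather than typed by hand. This is a sketch, assuming the same 10.0.0.0/16 VPC and third-octet offsets (1, 11, 21) as the diagram; `plan_subnets` and the constant names are hypothetical. It also shows the usable address count per /24 after the 5 addresses AWS reserves in each subnet.

```python
import ipaddress

# AWS reserves the first four addresses and the last address in every subnet.
AWS_RESERVED_PER_SUBNET = 5

# Third-octet starting offsets per tier, taken from the layout above.
TIER_OFFSETS = {"public": 1, "private": 11, "data": 21}
AZ_SUFFIXES = ["a", "b", "c"]


def plan_subnets(vpc_cidr, tier_offsets, az_suffixes):
    """Map (tier, az) to a /24 carved from the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # For a /16 that starts at x.y.0.0, the list index equals the third octet.
    blocks = list(vpc.subnets(new_prefix=24))
    return {
        (tier, az): blocks[offset + i]
        for tier, offset in tier_offsets.items()
        for i, az in enumerate(az_suffixes)
    }


plan = plan_subnets("10.0.0.0/16", TIER_OFFSETS, AZ_SUFFIXES)
for (tier, az), subnet in plan.items():
    usable = subnet.num_addresses - AWS_RESERVED_PER_SUBNET
    print(f"{tier:8} AZ-{az}  {subnet}  usable={usable}")
```

Leaving gaps between the tier offsets, as the layout does, keeps room to add subnets to a tier later without renumbering.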

Security Groups vs NACLs

Security Groups — Stateful, instance-level firewall. The primary access control mechanism:

| Rule | Source | Port | Purpose |
|---|---|---|---|
| ALB → Application | ALB security group | 8080 | Application traffic |
| Application → Database | Application security group | 5432 | Database queries |
| Application → Redis | Application security group | 6379 | Cache access |

Best practice: Reference security groups by ID (not CIDR) whenever possible. sg-abc123 is self-documenting and adapts automatically when instances are added or removed.

NACLs (Network ACLs) — Stateless, subnet-level firewall. Use as a secondary defense layer:

  • Default NACLs allow all traffic — do not rely on them for security
  • Custom NACLs can block known malicious IP ranges or restrict traffic between subnet tiers
  • NACLs are stateless, so they need explicit allow rules for both inbound and outbound traffic, including return traffic on ephemeral ports

Recommendation: Use security groups as the primary control. Add NACLs only for specific requirements (blocking IP ranges, enforcing subnet-level restrictions for compliance).
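
To make the stateless behavior concrete, here is a deliberately simplified model of NACL evaluation. It assumes rules match on destination port only (real NACLs also match protocol and CIDR); `NaclRule`, `evaluate`, and `round_trip` are hypothetical names. Rules are checked in ascending rule number, the first match wins, and anything unmatched is denied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int        # lower numbers are evaluated first
    port_range: tuple  # inclusive (low, high) destination ports
    allow: bool


def evaluate(rules, port):
    """First matching rule by ascending number decides; no match means deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        low, high = rule.port_range
        if low <= port <= high:
            return rule.allow
    return False  # implicit final deny


def round_trip(inbound, outbound, service_port, client_port):
    """A request must pass inbound rules AND the reply must pass outbound rules."""
    return evaluate(inbound, service_port) and evaluate(outbound, client_port)


# Data subnet NACL: PostgreSQL allowed in, but no outbound rule for replies.
inbound = [NaclRule(100, (5432, 5432), True)]
outbound_missing = []
outbound_fixed = [NaclRule(100, (1024, 65535), True)]  # ephemeral port range

print(round_trip(inbound, outbound_missing, 5432, 49152))  # reply is dropped
print(round_trip(inbound, outbound_fixed, 5432, 49152))    # reply is allowed
```

A security group would need only the inbound rule, because it tracks the connection and lets the reply through automatically. That difference is why security groups are easier to operate as the primary control.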

Internet Connectivity

NAT Gateways

NAT Gateways provide outbound internet access for private subnet resources (package updates, API calls, SaaS integrations):

Cost:

  • $0.045/hour per NAT Gateway = $32.40/month
  • $0.045/GB data processed

High availability: Deploy one NAT Gateway per AZ. If AZ-a’s NAT Gateway fails, resources in AZ-a lose internet access — but resources in AZ-b and AZ-c continue normally.

Cost optimization:

  • A single NAT Gateway in one AZ works for development environments ($32/month vs $97/month for three)
  • Use VPC endpoints for AWS service traffic to reduce NAT Gateway data processing charges
  • S3 and DynamoDB Gateway endpoints are free — always deploy them

VPC Endpoints

VPC endpoints provide private connectivity to AWS services without traversing the internet or NAT Gateway:

Gateway endpoints (free):

  • S3
  • DynamoDB

Always deploy these. They are free and reduce NAT Gateway costs.

Interface endpoints ($0.01/hour per AZ + $0.01/GB):

  • ECR (for container image pulls)
  • CloudWatch Logs (for log shipping)
  • STS (for IAM role assumption)
  • Secrets Manager / SSM Parameter Store
  • KMS
  • SQS, SNS, EventBridge
  • Lambda, Step Functions

Cost analysis: An interface endpoint in 3 AZs costs ~$21.60/month in hourly charges ($0.01 × 720 hours × 3 AZs). Traffic through the endpoint costs $0.01/GB instead of the NAT Gateway’s $0.045/GB, a saving of $0.035/GB, so the endpoint is cheaper once your workload sends more than about 617 GB/month to that service. For high-traffic services (ECR, CloudWatch Logs), endpoints almost always save money.
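
The break-even point can be worked out from the rates quoted in this section. The sketch below assumes a 720-hour month and the listed prices; the function names are hypothetical. It accounts for the endpoint's own $0.01/GB charge, so the result is higher than simply dividing the fixed cost by the NAT rate.

```python
HOURS_PER_MONTH = 720          # assumption: 30-day month
ENDPOINT_HOURLY_PER_AZ = 0.01  # USD, rate quoted above
ENDPOINT_PER_GB = 0.01         # USD
NAT_PER_GB = 0.045             # USD


def endpoint_fixed_monthly(az_count):
    """Hourly charges for one interface endpoint deployed in az_count AZs."""
    return ENDPOINT_HOURLY_PER_AZ * HOURS_PER_MONTH * az_count


def break_even_gb(az_count):
    """Monthly GB above which the endpoint is cheaper than sending via NAT."""
    saving_per_gb = NAT_PER_GB - ENDPOINT_PER_GB
    return endpoint_fixed_monthly(az_count) / saving_per_gb


print(round(endpoint_fixed_monthly(3), 2))  # 21.6
print(round(break_even_gb(3)))              # 617
```

Deploying the endpoint in fewer AZs lowers the threshold proportionally, at the cost of cross-AZ traffic from the AZs without an endpoint network interface.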

Security benefit: VPC endpoints keep traffic within the AWS network. Data never traverses the public internet, reducing the attack surface.

Multi-VPC Connectivity

VPC Peering

Point-to-point connectivity between two VPCs:

VPC A ←→ VPC B (peering connection)

Advantages: Simple, no additional cost (data transfer charges only), low latency.

Limitations: Not transitive (VPC A ↔ B and VPC B ↔ C does not mean VPC A ↔ C). For more than 3-4 VPCs, peering creates an unmanageable mesh.

Best for: Connecting 2-3 VPCs in simple architectures.
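
Both limitations are easy to quantify. A full mesh of n VPCs needs n(n-1)/2 peering connections, and reachability exists only where a direct peering exists. The sketch below is illustrative; the function names and VPC labels are hypothetical.

```python
def peering_connections(vpc_count):
    """Connections needed so every VPC can reach every other VPC directly."""
    return vpc_count * (vpc_count - 1) // 2


def can_reach(peerings, src, dst):
    """Peering is not transitive: only a direct peering provides a path."""
    return frozenset((src, dst)) in {frozenset(pair) for pair in peerings}


for n in (3, 5, 10):
    print(n, "VPCs ->", peering_connections(n), "peering connections")

peerings = [("A", "B"), ("B", "C")]
print(can_reach(peerings, "A", "B"))  # direct peering exists
print(can_reach(peerings, "A", "C"))  # no path through B
```

Each of those connections also needs route table entries on both sides, which is where the operational burden of a large mesh comes from.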

Transit Gateway

Hub-and-spoke connectivity for multiple VPCs and on-premises networks:

VPC A ───┐
VPC B ───┤
VPC C ───┼─── Transit Gateway ─── On-premises (VPN / Direct Connect)
VPC D ───┤
VPC E ───┘

Advantages:

  • Centralized routing — one hub connects all VPCs
  • Transitive routing — any VPC can reach any other VPC through the hub
  • VPN and Direct Connect integration
  • Route tables for segmentation (production VPCs cannot reach development VPCs)
  • Cross-Region peering for multi-Region architectures

Cost: $0.05/hour per attachment + $0.02/GB data processed. A Transit Gateway with 5 VPC attachments costs ~$180/month ($0.05 × 720 hours × 5) before data processing charges.

Best for: Multi-account organizations with 4+ VPCs, hybrid connectivity, or network segmentation requirements.
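
Segmentation on a Transit Gateway comes from two settings per attachment: which route table it is associated with (used for traffic it sends) and which route tables it propagates its routes into. The following sketch models that with plain dictionaries; all attachment and route table names are hypothetical, and static routes and blackhole routes are left out.

```python
# Which route table each attachment uses to look up destinations.
ASSOCIATIONS = {
    "prod-a": "prod-rt",
    "prod-b": "prod-rt",
    "dev-a": "dev-rt",
    "shared": "shared-rt",
}

# Which attachments have propagated their routes into each route table.
PROPAGATIONS = {
    "prod-rt": {"prod-a", "prod-b", "shared"},
    "dev-rt": {"dev-a", "shared"},
    "shared-rt": {"prod-a", "prod-b", "dev-a"},
}


def has_route(src, dst):
    """True if the route table associated with src contains a route to dst."""
    return dst in PROPAGATIONS[ASSOCIATIONS[src]]


def can_communicate(a, b):
    """Two-way traffic needs a route in each direction."""
    return has_route(a, b) and has_route(b, a)


print(can_communicate("prod-a", "prod-b"))  # same segment
print(can_communicate("prod-a", "shared"))  # shared services reachable
print(can_communicate("prod-a", "dev-a"))   # production isolated from development
```

In this model, production and development both reach shared services but never each other, because neither side's route table holds a route to the other.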

AWS PrivateLink

Expose a service from one VPC to another without VPC peering:

Consumer VPC → VPC Endpoint (Interface) → PrivateLink → NLB → Provider VPC

Best for: Sharing specific services (APIs, databases) across accounts without full network connectivity. The consumer only accesses the specific service endpoint — not the provider’s entire VPC.

Hybrid Connectivity

AWS VPN

Encrypted tunnels over the public internet:

| Option | Bandwidth | Latency | Cost |
|---|---|---|---|
| Site-to-Site VPN | Up to 1.25 Gbps per tunnel | Variable (internet) | $0.05/hour + data transfer |
| Client VPN | Per-connection | Variable | $0.10/hour + $0.05/connection-hour |

Best for: Quick connectivity setup, backup for Direct Connect, remote developer access.

AWS Direct Connect

Dedicated network connection from your data center to AWS:

| Option | Bandwidth | Latency | Cost |
|---|---|---|---|
| Dedicated (1 Gbps, 10 Gbps, 100 Gbps) | Dedicated | Consistent, low | Port fee + data transfer |
| Hosted (50 Mbps - 10 Gbps) | Shared | Consistent, low | Partner pricing + data transfer |

Best for: Production hybrid workloads requiring consistent latency and high throughput. Financial services, healthcare, and any workload with data residency requirements.

High availability: Deploy Direct Connect connections in two different Direct Connect locations. Use VPN as a backup for Direct Connect.

Network Cost Optimization

Data Transfer Costs

Data transfer is often the largest hidden cost in AWS networking:

| Transfer Type | Cost |
|---|---|
| Inbound (internet → AWS) | Free |
| Same AZ | Free |
| Cross-AZ (same Region) | $0.01/GB each direction |
| Cross-Region | $0.02/GB |
| Internet outbound | $0.09/GB (first 10 TB) |
| NAT Gateway processing | $0.045/GB |
| VPC endpoint processing | $0.01/GB |

Cost reduction strategies:

  • Keep communicating services in the same AZ when possible (free vs $0.02/GB cross-AZ, counting the $0.01/GB charged in each direction)
  • Use CloudFront for content delivery (cheaper outbound rates: $0.085/GB vs $0.09/GB, and cached content eliminates origin transfer)
  • Deploy S3 and DynamoDB gateway endpoints (free, eliminates NAT Gateway charges)
  • Use VPC endpoints for high-traffic AWS services
  • Compress data in transit to reduce GB transferred
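
A rough monthly estimate helps show which transfer type dominates a bill. The sketch below uses the rates from the table in this section; the traffic volumes are invented for illustration, and the internet outbound rate covers only the first 10 TB tier.

```python
# USD per GB, from the data transfer table in this section.
RATE_PER_GB = {
    "inbound": 0.0,
    "same_az": 0.0,
    "cross_az": 0.02,            # $0.01/GB charged in each direction
    "cross_region": 0.02,
    "internet_outbound": 0.09,   # first 10 TB tier only
    "nat_processing": 0.045,
    "endpoint_processing": 0.01,
}


def estimate_monthly_cost(gb_by_type):
    """Return the cost per transfer type for the given monthly GB volumes."""
    unknown = set(gb_by_type) - set(RATE_PER_GB)
    if unknown:
        raise ValueError(f"unknown transfer types: {sorted(unknown)}")
    return {kind: round(gb * RATE_PER_GB[kind], 2) for kind, gb in gb_by_type.items()}


# Illustrative volumes only, not measurements.
costs = estimate_monthly_cost(
    {"cross_az": 5000, "internet_outbound": 2000, "nat_processing": 1000}
)
for kind, usd in sorted(costs.items(), key=lambda item: -item[1]):
    print(f"{kind:20} ${usd:,.2f}")
print(f"{'total':20} ${sum(costs.values()):,.2f}")
```

In this made-up example, cross-AZ chatter costs more than twice as much as NAT processing, which is why AZ-aware routing is listed first among the strategies.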

NAT Gateway Cost Reduction

NAT Gateways charge $0.045/GB for data processing. For workloads making heavy use of AWS services:

  1. Deploy S3 and DynamoDB gateway endpoints: free, and they remove NAT charges for what are often the two highest-volume services
  2. Deploy interface endpoints for ECR: container image pulls through NAT are expensive, and the endpoint is cheaper for most workloads
  3. Deploy a CloudWatch Logs endpoint: log shipping volume can be significant
  4. Consolidate internet access: if multiple VPCs need internet access, route through a centralized NAT in a shared VPC via Transit Gateway. This saves per-VPC NAT hourly charges but adds Transit Gateway data processing ($0.02/GB), so it pays off mainly when egress volume is modest
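
Whether centralizing NAT saves money depends on egress volume, because traffic routed through a Transit Gateway pays its per-GB processing charge on top of the NAT charge. The sketch below compares the two designs using the rates quoted in this guide. It makes simplifying assumptions: a 720-hour month, one Transit Gateway attachment per spoke VPC plus one for the egress VPC, and the same GB volume counted at both the NAT Gateway and the Transit Gateway. Function names are hypothetical.

```python
HOURS_PER_MONTH = 720
NAT_HOURLY, NAT_PER_GB = 0.045, 0.045
TGW_ATTACHMENT_HOURLY, TGW_PER_GB = 0.05, 0.02


def distributed_cost(vpcs, nats_per_vpc, gb):
    """Every VPC runs its own NAT Gateways."""
    return vpcs * nats_per_vpc * NAT_HOURLY * HOURS_PER_MONTH + gb * NAT_PER_GB


def centralized_cost(vpcs, nats_in_egress_vpc, gb, count_attachments=True):
    """Spoke VPCs send egress through a Transit Gateway to one shared egress VPC.

    Set count_attachments=False if the attachments already exist for other reasons.
    """
    attachments = vpcs + 1 if count_attachments else 0
    fixed = (
        nats_in_egress_vpc * NAT_HOURLY * HOURS_PER_MONTH
        + attachments * TGW_ATTACHMENT_HOURLY * HOURS_PER_MONTH
    )
    return fixed + gb * (NAT_PER_GB + TGW_PER_GB)


for gb in (1_000, 10_000):
    d = distributed_cost(vpcs=5, nats_per_vpc=3, gb=gb)
    c = centralized_cost(vpcs=5, nats_in_egress_vpc=3, gb=gb)
    print(f"{gb:>6} GB  distributed=${d:,.2f}  centralized=${c:,.2f}")
```

With these inputs the centralized design is cheaper at 1,000 GB/month but more expensive at 10,000 GB/month, so the data volume should be measured before consolidating.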

Monitoring

VPC Flow Logs

Enable VPC Flow Logs to capture network traffic metadata:

  • Accepted traffic — Useful for understanding communication patterns and traffic volume
  • Rejected traffic — Security monitoring (port scans, unauthorized access attempts)
  • All traffic — Complete visibility (highest cost)

Send flow logs to S3 for long-term analysis or CloudWatch Logs for real-time alerting. Use Athena to query flow logs in S3 for network forensics.
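
For quick checks outside Athena, flow log records are simple to parse. The sketch below assumes the default (version 2) record format, which is 14 space-separated fields; the sample records are made up, and `parse_record` and `rejected_by_source` are hypothetical helper names.

```python
from collections import Counter

# Field order of the default (version 2) flow log format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line):
    """Split one default-format record into a field dictionary."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def rejected_by_source(lines):
    """Count REJECT records per source address."""
    counts = Counter()
    for line in lines:
        record = parse_record(line)
        if record["action"] == "REJECT":
            counts[record["srcaddr"]] += 1
    return counts


# Made-up sample records for illustration.
SAMPLE = [
    "2 123456789010 eni-0123456789abcdef0 10.0.11.5 10.0.21.8 49152 5432 6 20 4249 1700000000 1700000060 ACCEPT OK",
    "2 123456789010 eni-0123456789abcdef0 198.51.100.7 10.0.1.10 40000 3389 6 3 180 1700000000 1700000060 REJECT OK",
    "2 123456789010 eni-0123456789abcdef0 198.51.100.7 10.0.1.10 40001 22 6 3 180 1700000000 1700000060 REJECT OK",
]

print(rejected_by_source(SAMPLE).most_common(5))
```

A source address with many rejected records across different destination ports is a typical port-scan signature. Flow logs configured with a custom format would need a matching `FIELDS` list.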

Network Monitoring

Set CloudWatch alarms for:

  • NAT Gateway ErrorPortAllocation — NAT Gateway running out of ports (scale or split traffic)
  • NAT Gateway BytesOutToDestination — Unexpected data transfer volume
  • Transit Gateway BytesIn/BytesOut — Traffic volume anomalies
  • VPN TunnelState — VPN tunnel down
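
Alarm definitions like these can be kept as data and reviewed alongside the rest of the infrastructure code. The sketch below only builds parameter dictionaries shaped like the arguments of CloudWatch's PutMetricAlarm API; it does not call AWS. The resource IDs, alarm names, periods, and thresholds are placeholder assumptions to adjust for your environment.

```python
def alarm_kwargs(name, namespace, metric, dimension, resource_id,
                 statistic, comparison, threshold):
    """Build PutMetricAlarm-style parameters for a single-dimension metric."""
    return {
        "AlarmName": name,
        "Namespace": namespace,
        "MetricName": metric,
        "Dimensions": [{"Name": dimension, "Value": resource_id}],
        "Statistic": statistic,
        "Period": 300,            # seconds; assumption
        "EvaluationPeriods": 1,   # assumption
        "Threshold": threshold,
        "ComparisonOperator": comparison,
    }


# Placeholder resource IDs; replace with real ones.
ALARMS = [
    # Any port allocation error in the period suggests port exhaustion.
    alarm_kwargs("nat-port-exhaustion", "AWS/NATGateway", "ErrorPortAllocation",
                 "NatGatewayId", "nat-EXAMPLE", "Sum", "GreaterThanThreshold", 0),
    # TunnelState is 1 when up and 0 when down; alert when it drops below 1.
    alarm_kwargs("vpn-tunnel-down", "AWS/VPN", "TunnelState",
                 "VpnId", "vpn-EXAMPLE", "Minimum", "LessThanThreshold", 1),
]

for alarm in ALARMS:
    print(alarm["AlarmName"], alarm["Namespace"], alarm["MetricName"])
```

With boto3 installed and credentials configured, each dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**alarm)`, typically with an `AlarmActions` entry added so the alarm notifies someone.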

Common Mistakes

Mistake 1: Insufficient CIDR Planning

Starting with a /24 VPC (256 IPs) and discovering you need more IPs after deploying 50 services. While you can add secondary CIDRs, the expanded range may overlap with other VPCs. Plan for growth with /16 VPCs from the start.

Mistake 2: Everything in Public Subnets

Placing application servers and databases in public subnets because “it is easier to access.” Every resource that does not need inbound internet traffic should be in a private subnet. Use ALBs for inbound traffic and NAT Gateways for outbound.

Mistake 3: No VPC Endpoints

Routing all AWS service traffic through NAT Gateways when gateway endpoints (S3, DynamoDB) are free. Deploy gateway endpoints in every VPC — they cost nothing and save significant NAT Gateway data processing fees.

Mistake 4: Single-AZ Deployment

Deploying all resources in a single Availability Zone. When that AZ has an incident (hardware failure, network issue), your entire application goes down. Always deploy across at least 2 AZs for production workloads.

Getting Started

VPC networking is the foundation that determines the security, connectivity, and cost characteristics of everything you build on AWS. Getting the network right during initial setup prevents expensive re-architecture later.

For network architecture design as part of your AWS architecture review, multi-account networking with Transit Gateway, or hybrid connectivity planning, talk to our team.
