AWS ElastiCache Redis: Caching Strategies for Production

Cloud Architecture · 8 min read

Quick summary: A practical guide to ElastiCache Redis — caching patterns, data structures, cluster modes, eviction policies, and the strategies that reduce latency and database load in production applications.


Caching is the most cost-effective way to improve application performance. A single Redis cache node can serve hundreds of thousands of reads per second with sub-millisecond latency — orders of magnitude faster than any database query. For applications bottlenecked by database read latency or struggling under read-heavy traffic patterns, Redis caching transforms performance without re-architecting the application.

AWS ElastiCache for Redis provides managed Redis clusters that handle replication, failover, patching, and backup — the operational tasks that make self-managed Redis painful at scale. This guide covers the caching strategies and ElastiCache configurations that work in production.

When to Use Caching

Caching Makes Sense When

  • Read-heavy workloads — Your application reads far more than it writes (10:1 or higher read-to-write ratio)
  • Expensive queries — Database queries involve joins, aggregations, or full-text search that take 50ms+
  • Repeated access patterns — The same data is requested by multiple users (product pages, configuration, leaderboards)
  • Latency requirements — Your API must respond in under 50ms, and database queries take longer
  • Database bottleneck — Your RDS or DynamoDB read capacity is saturated and scaling the database is expensive

Caching Does Not Help When

  • Write-heavy workloads — If every request writes unique data, caching adds complexity without benefit
  • Unique queries — If every query is different (ad-hoc analytics, search with unique parameters), cache hit rates will be low
  • Strong consistency requirements — If stale data is never acceptable, caching introduces consistency complexity

Caching Patterns

Pattern 1: Cache-Aside (Lazy Loading)

The most common pattern — the application checks the cache first and falls back to the database on cache miss:

Read request
  → Check Redis cache
    → Cache hit → Return cached data (sub-millisecond)
    → Cache miss → Query database → Store result in Redis → Return data

Advantages:

  • Only caches data that is actually requested (no wasted memory)
  • Cache failures do not break the application (falls back to database)
  • Simple to implement

Disadvantages:

  • First request for each item hits the database (cold cache)
  • Stale data possible if database is updated without invalidating cache
  • Cache stampede risk when many concurrent requests miss the cache simultaneously

Implementation considerations:

  • Set a TTL (time-to-live) on every cached item to limit staleness
  • Implement cache invalidation on write operations
  • Use a mutex/lock for expensive queries to prevent cache stampede
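The considerations above can be sketched in Python. This is a minimal illustration, not a production implementation: `FakeRedis` is a dict-backed stand-in so the example runs without a server, and with redis-py the same `get`/`set(ex=...)` calls would go to ElastiCache. The lock prevents a stampede only within one process; cross-process stampede control needs a distributed lock (e.g. Redis `SET NX`).

```python
import threading
import time

class FakeRedis:
    """In-memory stand-in for a Redis client (illustration only).
    A real application would use redis-py's `redis.Redis` with the same calls."""
    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and time.time() >= expires_at:
            del self._data[key]  # TTL elapsed: treat as a miss
            return None
        return value

    def set(self, key, value, ex=None):
        expires_at = time.time() + ex if ex else None
        self._data[key] = (value, expires_at)

cache = FakeRedis()
_locks = {}  # per-key locks to prevent a cache stampede (single-process only)

def get_user(user_id, db_query):
    """Cache-aside read: check the cache, fall back to the database on a miss."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is not None:
        return value                       # cache hit: fast path
    lock = _locks.setdefault(key, threading.Lock())
    with lock:                             # only one caller runs the expensive query
        value = cache.get(key)             # re-check: another thread may have filled it
        if value is None:
            value = db_query(user_id)      # cache miss: hit the database
            cache.set(key, value, ex=300)  # TTL of 5 minutes limits staleness
    return value
```

The double-check inside the lock is what makes the stampede protection work: concurrent callers that queued on the lock find the cache already populated and skip the database entirely.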

Pattern 2: Write-Through

Write to both the cache and database simultaneously:

Write request
  → Write to Redis cache
  → Write to database
  → Return success

Advantages:

  • Cache is always up to date with the database
  • No stale data
  • Read requests always hit the cache (after initial population)

Disadvantages:

  • Every write incurs two operations (cache + database), adding write latency
  • Data that is written but never read still consumes cache memory

Best for: Data that is frequently read after being written (user profiles, session data, configuration).
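A write-through sketch, with plain dicts standing in for the cache and database so it runs anywhere. One deliberate design choice shown here: write the database first, so a failed database write never leaves the cache ahead of the source of truth.

```python
cache = {}      # stands in for Redis (redis-py `set` with ex= in production)
database = {}   # stands in for RDS/DynamoDB

def write_through(key, value):
    """Write-through: persist to the database, then mirror into the cache,
    so the cache never holds data the database lost."""
    database[key] = value
    cache[key] = value

def read(key):
    # after write_through, reads are always served from the cache
    return cache.get(key)
```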

Pattern 3: Write-Behind (Write-Back)

Write to the cache immediately and asynchronously write to the database:

Write request
  → Write to Redis cache → Return success immediately
  → Background process → Write to database (async)

Advantages:

  • Lowest write latency (only cache write is synchronous)
  • Batches database writes for efficiency
  • Absorbs write spikes without database overload

Disadvantages:

  • Data loss risk if Redis fails before database write completes
  • Complex consistency management
  • Requires reliable background processing

Best for: High-throughput write workloads where slight data loss is acceptable (analytics counters, activity feeds, non-critical metrics).
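A write-behind sketch using an in-process queue and a background drain thread. This is illustration only: in production the cache would be Redis, the queue would be durable (or a Redis Stream), and the worker would batch and retry database writes.

```python
import queue
import threading

cache = {}                    # stands in for Redis
database = {}                 # stands in for the backing store
write_queue = queue.Queue()   # pending database writes

def write_behind(key, value):
    """Write-behind: acknowledge after the cache write; the database write
    happens asynchronously (data-loss window until it completes)."""
    cache[key] = value
    write_queue.put((key, value))   # returns immediately

def flush_worker():
    """Background drain: a real worker would batch writes and retry failures."""
    while True:
        key, value = write_queue.get()
        database[key] = value
        write_queue.task_done()

threading.Thread(target=flush_worker, daemon=True).start()
```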

Pattern 4: Read-Through with TTL Refresh

Automatically refresh cached data before TTL expires:

Background process
  → Scan for items approaching TTL expiry
  → Re-query database for fresh data
  → Update cache with fresh data
  → Users always see cached data (never hit database)

Best for: High-traffic items (homepage content, product catalogs) where cache misses cause noticeable latency and database load.

Redis Data Structures for Caching

Redis provides data structures beyond simple key-value storage. Choosing the right structure improves efficiency:

| Data Structure | Use Case | Example |
| --- | --- | --- |
| String | Simple key-value cache | User profile, API response, session data |
| Hash | Object with multiple fields | User: {name, email, role, lastLogin} |
| List | Ordered collection, recent items | Activity feed, recent orders |
| Set | Unique collection, membership | Online users, unique visitors |
| Sorted Set | Ranked collection | Leaderboard, trending products |
| Stream | Event log, message queue | Activity stream, change notifications |

Practical Examples

Session storage (Hash):

HSET session:abc-123 userId "user-001" role "admin" tenant "acme" expiresAt "1720000000"
EXPIRE session:abc-123 3600

Leaderboard (Sorted Set):

ZADD leaderboard 1500 "player-001"
ZADD leaderboard 2300 "player-002"
ZREVRANGE leaderboard 0 9 WITHSCORES  # Top 10 players

Rate limiting (String with INCR):

INCR rate:user-001:2026-08-10T14:30
EXPIRE rate:user-001:2026-08-10T14:30 60  # 1-minute window
# Check: if count > 100, reject request
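The INCR/EXPIRE commands above implement a fixed-window counter. A Python sketch of the same logic, dict-backed so it runs without a server; with redis-py the counter would be `incr(key)` plus `expire(key, 60)` against the shared cluster, so all application instances enforce one limit:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter mirroring the INCR/EXPIRE pattern.
    The window index is baked into the key, so old windows simply stop
    being read (Redis would expire them via the TTL)."""
    def __init__(self, limit=100, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # window key -> request count

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        window_key = f"rate:{user_id}:{int(now // self.window)}"
        count = self.counters.get(window_key, 0) + 1   # INCR
        self.counters[window_key] = count
        return count <= self.limit                     # reject once over the limit
```

Note the fixed-window trade-off: a burst straddling a window boundary can briefly see up to 2× the limit; a sliding-window or token-bucket variant avoids that at the cost of more state.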

ElastiCache Configuration

Cluster Modes

Cluster Mode Disabled (single shard):

  • One primary node + up to 5 read replicas
  • All data on a single shard (limited by single node memory)
  • Simpler to manage
  • Max memory: 635.61 GB (r7g.16xlarge)

Cluster Mode Enabled (multiple shards):

  • Data partitioned across up to 500 shards
  • Each shard has a primary + up to 5 replicas
  • Total memory = shards × node memory (theoretically unlimited)
  • Supports online resharding (add/remove shards without downtime)

When to use Cluster Mode Enabled:

  • Dataset exceeds single node memory
  • Write throughput exceeds single primary capacity
  • You need online scaling (adding shards without downtime)

When Cluster Mode Disabled is sufficient:

  • Dataset fits in a single node
  • Read scaling via replicas is sufficient
  • Simpler operations preferred

Node Types

| Category | Example | Use Case |
| --- | --- | --- |
| General Purpose (m7g) | cache.m7g.large | Balanced workloads, most production use cases |
| Memory Optimized (r7g) | cache.r7g.xlarge | Large datasets, high memory-to-CPU ratio |
| Small/Dev (t4g) | cache.t4g.micro | Development, testing, low-traffic production |

Graviton (g suffix) instances provide 20-30% better price-performance than equivalent Intel instances. Always use Graviton for new deployments.

High Availability

  • Multi-AZ with automatic failover — Always enable for production. If the primary node fails, ElastiCache automatically promotes a replica to primary (failover time: typically 10-30 seconds).
  • Read replicas — Scale read capacity horizontally. Your application reads from replicas and writes to the primary.
  • Global Datastore — Cross-Region replication for disaster recovery and low-latency global reads.

Cache Invalidation

Cache invalidation is the hardest problem in caching. Stale data causes bugs; aggressive invalidation reduces cache hit rates.

TTL-Based Expiry

Set a TTL on every cached item:

| Data Type | Recommended TTL | Rationale |
| --- | --- | --- |
| Configuration | 5-15 minutes | Changes infrequently, slight staleness acceptable |
| User profile | 1-5 minutes | Changes occasionally, brief staleness tolerable |
| Product catalog | 15-60 minutes | Changes via admin updates, not user-facing mutations |
| API response | 30-300 seconds | Depends on data freshness requirements |
| Session data | 30-60 minutes | Match session timeout policy |

Event-Based Invalidation

Invalidate cache entries when the underlying data changes:

Database write (DynamoDB Stream / RDS event)
  → Lambda function
  → Delete or update Redis cache entry

For DynamoDB, use DynamoDB Streams to trigger Lambda functions that invalidate corresponding cache entries. For RDS, use event notifications or application-level invalidation.
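A sketch of the DynamoDB Streams path, assuming a table whose partition key attribute is `pk` and a cache key scheme of `product:<id>` (both hypothetical). The handler takes the cache client as a parameter for testability; a real Lambda handler takes only `(event, context)` and would create a redis-py client at module scope.

```python
def extract_invalidation_keys(event):
    """Map DynamoDB Stream records to the cache keys they invalidate.
    Only MODIFY/REMOVE events make cached copies stale; INSERTs cannot."""
    keys = []
    for record in event.get("Records", []):
        if record.get("eventName") in ("MODIFY", "REMOVE"):
            item_id = record["dynamodb"]["Keys"]["pk"]["S"]  # assumed key attribute
            keys.append(f"product:{item_id}")
    return keys

def handler(event, context, cache_client):
    """Delete stale entries; `cache_client.delete(*keys)` matches redis-py's DEL."""
    keys = extract_invalidation_keys(event)
    if keys:
        cache_client.delete(*keys)
    return {"invalidated": len(keys)}
```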

Tag-Based Invalidation

Group related cache entries with tags for bulk invalidation:

Cache entry: product:123 → tags: ["catalog", "category:electronics"]
Cache entry: product:456 → tags: ["catalog", "category:electronics"]

Invalidate: all entries tagged "category:electronics"
→ Deletes product:123 and product:456 simultaneously

Implement with Redis Sets: maintain a set per tag containing all keys associated with that tag.
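The set-per-tag idea can be sketched as follows, with Python dicts and sets standing in for Redis. Against a real cluster, `set` would issue `SADD tag:<name> <key>` per tag, and `invalidate_tag` would be `SMEMBERS` followed by `DEL`:

```python
class TaggedCache:
    """Tag-based invalidation: one set per tag tracks its member keys."""
    def __init__(self):
        self.data = {}   # cache entries
        self.tags = {}   # tag -> set of keys carrying that tag

    def set(self, key, value, tags=()):
        self.data[key] = value
        for tag in tags:
            self.tags.setdefault(tag, set()).add(key)   # SADD tag:<t> key

    def get(self, key):
        return self.data.get(key)

    def invalidate_tag(self, tag):
        """Delete every key associated with a tag (SMEMBERS, then DEL).
        Production code would also remove the keys from their other tag
        sets to avoid stale tag membership."""
        for key in self.tags.pop(tag, set()):
            self.data.pop(key, None)
```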

ElastiCache Serverless

ElastiCache Serverless removes capacity planning entirely:

  • Automatically scales memory and compute based on usage
  • No node selection, no cluster management
  • Pay for data stored (per GB-hour) and compute (per ECPU)
  • Minimum charge applies ($0.125/hour ≈ $90/month)

When to use Serverless:

  • Unpredictable or spiky traffic patterns
  • New applications where cache sizing is unknown
  • Teams that want to avoid capacity planning

When to use provisioned nodes:

  • Predictable workloads where node sizing is known
  • Cost optimization with Reserved Nodes (up to 55% savings)
  • Requirements for specific node types or cluster configurations

Monitoring

Key CloudWatch Metrics

| Metric | Target | Action If Outside Target |
| --- | --- | --- |
| CacheHitRate | > 80% | Low hit rate = wrong caching strategy or TTL |
| EngineCPUUtilization | < 70% | Scale up or add shards |
| DatabaseMemoryUsagePercentage | < 80% | Scale up or review eviction policy |
| CurrConnections | Below max | Connection pooling issue if near limit |
| ReplicationLag | < 1 second | Network or replica capacity issue |
| Evictions | Near zero | Memory pressure if evictions increase |

Set CloudWatch alarms for:

  • EngineCPUUtilization > 70% — Scale before performance degrades
  • DatabaseMemoryUsagePercentage > 80% — Scale before evictions begin
  • CacheHitRate < 50% — Investigate caching strategy
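The first alarm above can be expressed as parameters for CloudWatch's `put_metric_alarm` API (boto3). This sketch only builds the parameter dict; the cluster ID is a placeholder, and a real alarm would add an SNS topic ARN under `AlarmActions`:

```python
def cpu_alarm_params(cluster_id, threshold=70.0):
    """Parameters for boto3's cloudwatch.put_metric_alarm(**params).
    Alarms when average EngineCPUUtilization exceeds the threshold
    for three consecutive 5-minute periods."""
    return {
        "AlarmName": f"{cluster_id}-engine-cpu-high",
        "Namespace": "AWS/ElastiCache",
        "MetricName": "EngineCPUUtilization",
        "Dimensions": [{"Name": "CacheClusterId", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 3,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }
```

The same shape works for the memory and hit-rate alarms by swapping `MetricName`, `Threshold`, and the comparison operator.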

Cost Optimization

Right-Sizing

Monitor DatabaseMemoryUsagePercentage over 2 weeks. If consistently below 50%, you are paying for unused memory. Downsize to a smaller node type.

Reserved Nodes

For steady-state production caches, Reserved Nodes provide significant savings:

| Payment Option | 1-Year Savings | 3-Year Savings |
| --- | --- | --- |
| No upfront | ~28% | ~41% |
| Partial upfront | ~35% | ~50% |
| All upfront | ~38% | ~55% |

Data Tiering

ElastiCache data tiering automatically moves less-frequently accessed data to SSD storage, reducing memory costs for large datasets:

  • Hot data stays in memory (sub-millisecond latency)
  • Warm data moves to SSD (single-digit millisecond latency)
  • Available on r6gd and r7gd node types

Common Mistakes

Mistake 1: Caching Without TTL

Cached data without a TTL lives forever — becoming stale as the source database changes. Always set a TTL. If you are unsure, start with 5 minutes and adjust based on your data’s change frequency and tolerance for staleness.

Mistake 2: No Connection Pooling

Creating a new Redis connection for every request is expensive. Use connection pooling in your application. For Lambda, initialize the Redis connection outside the handler function to reuse connections across invocations.
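A minimal sketch of the reuse pattern. The factory is injected so the example runs anywhere; in a real Lambda it would be `redis.Redis(host=..., port=6379)` from the redis-py package, and the `get_client()` call (or the client itself) would live at module scope so every warm invocation reuses the same connection.

```python
_client = None  # module scope: survives across warm Lambda invocations

def get_client(factory):
    """Create the client on the first (cold) call, then reuse it."""
    global _client
    if _client is None:
        _client = factory()   # e.g. redis.Redis(host=..., port=6379)
    return _client

def handler(event, context, factory):
    # real handlers take only (event, context); factory is injected here
    # so the sketch is testable without a Redis endpoint
    client = get_client(factory)
    return client
```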

Mistake 3: Using Redis as Primary Storage

Redis is a cache, not a database. If your application cannot function when Redis is empty (cold start, failover, eviction), you have a cache dependency, not a caching strategy. Every cached item must be retrievable from the primary data store.

Mistake 4: Caching Too Much

Not all data benefits from caching. Data accessed once (unique search results, one-time API calls) wastes cache memory. Focus caching on frequently accessed, expensive-to-compute, or slowly changing data.

Getting Started

ElastiCache Redis fills the performance gap between your application and your database. For read-heavy serverless applications, high-traffic APIs, and latency-sensitive workloads, a well-implemented caching layer provides the single largest performance improvement available.

For caching architecture design, ElastiCache configuration, and performance optimization as part of our architecture review or managed services, talk to our team.

