AWS ElastiCache Redis: Caching Strategies for Production
Quick summary: A practical guide to ElastiCache Redis — caching patterns, data structures, cluster modes, eviction policies, and the strategies that reduce latency and database load in production applications.

Caching is often the most cost-effective way to improve application performance. A single Redis cache node can serve hundreds of thousands of reads per second with sub-millisecond latency — orders of magnitude faster than a typical database query. For applications bottlenecked by database read latency or struggling under read-heavy traffic patterns, Redis caching transforms performance without re-architecting the application.
AWS ElastiCache for Redis provides managed Redis clusters that handle replication, failover, patching, and backup — the operational tasks that make self-managed Redis painful at scale. This guide covers the caching strategies and ElastiCache configurations that work in production.
When to Use Caching
Caching Makes Sense When
- Read-heavy workloads — Your application reads far more than it writes (10:1 or higher read-to-write ratio)
- Expensive queries — Database queries involve joins, aggregations, or full-text search that take 50ms+
- Repeated access patterns — The same data is requested by multiple users (product pages, configuration, leaderboards)
- Latency requirements — Your API must respond in under 50ms, and database queries take longer
- Database bottleneck — Your RDS or DynamoDB read capacity is saturated and scaling the database is expensive
Caching Does Not Help When
- Write-heavy workloads — If every request writes unique data, caching adds complexity without benefit
- Unique queries — If every query is different (ad-hoc analytics, search with unique parameters), cache hit rates will be low
- Strong consistency requirements — If stale data is never acceptable, caching introduces consistency complexity
Caching Patterns
Pattern 1: Cache-Aside (Lazy Loading)
The most common pattern — the application checks the cache first and falls back to the database on cache miss:
Read request
→ Check Redis cache
→ Cache hit → Return cached data (sub-millisecond)
→ Cache miss → Query database → Store result in Redis → Return data
Advantages:
- Only caches data that is actually requested (no wasted memory)
- Cache failures do not break the application (falls back to database)
- Simple to implement
Disadvantages:
- First request for each item hits the database (cold cache)
- Stale data possible if database is updated without invalidating cache
- Cache stampede risk when many concurrent requests miss the cache simultaneously
Implementation considerations:
- Set a TTL (time-to-live) on every cached item to limit staleness
- Implement cache invalidation on write operations
- Use a mutex/lock for expensive queries to prevent cache stampede
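The considerations above can be sketched in code. This is a minimal illustration, not an ElastiCache API: `FakeCache` is an in-memory stand-in for a redis-py client (only `get`/`setex`), so the example runs without a live cluster, and `cache_aside_get` / `load_from_db` are hypothetical names.

```python
import threading
import time

# In-memory stand-in for a Redis client (get/setex only), so the sketch
# runs without a live cluster; swap in a redis-py client in production.
class FakeCache:
    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: behave like a cache miss
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

_lock = threading.Lock()  # per-key locks scale better; one lock keeps it short

def cache_aside_get(cache, key, load_from_db, ttl_seconds=300):
    """Cache-aside read: check the cache, fall back to the database on miss."""
    value = cache.get(key)
    if value is not None:
        return value  # cache hit
    # Mutex prevents a cache stampede: only one caller runs the expensive query.
    with _lock:
        value = cache.get(key)  # re-check: another thread may have filled it
        if value is None:
            value = load_from_db(key)
            cache.setex(key, ttl_seconds, value)  # TTL bounds staleness
    return value
```

The double-check inside the lock is the stampede guard: concurrent callers that missed the cache wait, then find the value already populated instead of each hitting the database.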
Pattern 2: Write-Through
Write to both the cache and database simultaneously:
Write request
→ Write to Redis cache
→ Write to database
→ Return success
Advantages:
- Cache is always up to date with the database
- No stale data
- Read requests always hit the cache (after initial population)
Disadvantages:
- Every write has the overhead of two operations (cache + database)
- Data that is written but never read still consumes cache memory
- Cache contains data that may never be requested
Best for: Data that is frequently read after being written (user profiles, session data, configuration).
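A minimal write-through sketch, using plain dicts as stand-ins for the Redis client and the database layer (all names here are illustrative):

```python
def write_through(cache, db, key, value):
    """Write-through: update the database and the cache on every write,
    so subsequent reads are served from the cache."""
    db[key] = value     # durable write first: never cache a value the DB rejected
    cache[key] = value  # with redis-py this would be r.setex(key, ttl, value)
    return value

def read(cache, db, key):
    """Reads follow the cache-aside path, but rarely miss."""
    return cache[key] if key in cache else db.get(key)
```

Writing the database before the cache is a common ordering choice: if the durable write fails, the cache never serves a value that was never persisted.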
Pattern 3: Write-Behind (Write-Back)
Write to the cache immediately and asynchronously write to the database:
Write request
→ Write to Redis cache → Return success immediately
→ Background process → Write to database (async)
Advantages:
- Lowest write latency (only cache write is synchronous)
- Batches database writes for efficiency
- Absorbs write spikes without database overload
Disadvantages:
- Data loss risk if Redis fails before database write completes
- Complex consistency management
- Requires reliable background processing
Best for: High-throughput write workloads where slight data loss is acceptable (analytics counters, activity feeds, non-critical metrics).
Pattern 4: Read-Through with TTL Refresh
Automatically refresh cached data before TTL expires:
Background process
→ Scan for items approaching TTL expiry
→ Re-query database for fresh data
→ Update cache with fresh data
→ Users always see cached data (never hit database)
Best for: High-traffic items (homepage content, product catalogs) where cache misses cause noticeable latency and database load.
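The refresh job can be sketched as follows. The store is a dict of key to (value, expires_at) standing in for Redis — with redis-py you would check `r.ttl(key)` and re-`SETEX` — and all names are illustrative.

```python
import time

def refresh_expiring(store, load_from_db, ttl_seconds=600, refresh_window=60):
    """Re-warm entries whose TTL is about to expire, so readers never miss."""
    now = time.monotonic()
    for key, (_, expires_at) in list(store.items()):
        if expires_at - now <= refresh_window:       # approaching expiry
            fresh = load_from_db(key)                # re-query the database
            store[key] = (fresh, now + ttl_seconds)  # reset value and TTL
```

In real Redis, scanning every key for its TTL is expensive; a common approach is to track refresh candidates in a Sorted Set scored by expiry time and pop only the ones due for refresh.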
Redis Data Structures for Caching
Redis provides data structures beyond simple key-value storage. Choosing the right structure improves efficiency:
| Data Structure | Use Case | Example |
|---|---|---|
| String | Simple key-value cache | User profile, API response, session data |
| Hash | Object with multiple fields | User: {name, email, role, lastLogin} |
| List | Ordered collection, recent items | Activity feed, recent orders |
| Set | Unique collection, membership | Online users, unique visitors |
| Sorted Set | Ranked collection | Leaderboard, trending products |
| Stream | Event log, message queue | Activity stream, change notifications |
Practical Examples
Session storage (Hash):
HSET session:abc-123 userId "user-001" role "admin" tenant "acme" expiresAt "1720000000"
EXPIRE session:abc-123 3600
Leaderboard (Sorted Set):
ZADD leaderboard 1500 "player-001"
ZADD leaderboard 2300 "player-002"
ZREVRANGE leaderboard 0 9 WITHSCORES # Top 10 players
Rate limiting (String with INCR):
INCR rate:user-001:2026-08-10T14:30
EXPIRE rate:user-001:2026-08-10T14:30 60 # 1-minute window
# Check: if count > 100, reject request
ElastiCache Configuration
Cluster Modes
Cluster Mode Disabled (single shard):
- One primary node + up to 5 read replicas
- All data on a single shard (limited by single node memory)
- Simpler to manage
- Max memory: 635.61 GB (r7g.16xlarge)
Cluster Mode Enabled (multiple shards):
- Data partitioned across up to 500 shards
- Each shard has a primary + up to 5 replicas
- Total memory = shards × node memory (theoretically unlimited)
- Supports online resharding (add/remove shards without downtime)
When to use Cluster Mode Enabled:
- Dataset exceeds single node memory
- Write throughput exceeds single primary capacity
- You need online scaling (adding shards without downtime)
When Cluster Mode Disabled is sufficient:
- Dataset fits in a single node
- Read scaling via replicas is sufficient
- Simpler operations preferred
Node Types
| Category | Example | Use Case |
|---|---|---|
| General Purpose (m7g) | cache.m7g.large | Balanced workloads, most production use cases |
| Memory Optimized (r7g) | cache.r7g.xlarge | Large datasets, high memory-to-CPU ratio |
| Small/Dev (t4g) | cache.t4g.micro | Development, testing, low-traffic production |
Graviton (g suffix) instances provide 20-30% better price-performance than equivalent Intel instances. Always use Graviton for new deployments.
High Availability
- Multi-AZ with automatic failover — Always enable for production. If the primary node fails, ElastiCache automatically promotes a replica to primary (failover time: typically 10-30 seconds).
- Read replicas — Scale read capacity horizontally. Your application reads from replicas and writes to the primary.
- Global Datastore — Cross-Region replication for disaster recovery and low-latency global reads.
Cache Invalidation
Cache invalidation is the hardest problem in caching. Stale data causes bugs; aggressive invalidation reduces cache hit rates.
TTL-Based Expiry
Set a TTL on every cached item:
| Data Type | Recommended TTL | Rationale |
|---|---|---|
| Configuration | 5-15 minutes | Changes infrequently, slight staleness acceptable |
| User profile | 1-5 minutes | Changes occasionally, brief staleness tolerable |
| Product catalog | 15-60 minutes | Changes via admin updates, not user-facing mutations |
| API response | 30-300 seconds | Depends on data freshness requirements |
| Session data | 30-60 minutes | Match session timeout policy |
Event-Based Invalidation
Invalidate cache entries when the underlying data changes:
Database write (DynamoDB Stream / RDS event)
→ Lambda function
→ Delete or update Redis cache entry
For DynamoDB, use DynamoDB Streams to trigger Lambda functions that invalidate corresponding cache entries. For RDS, use event notifications or application-level invalidation.
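A sketch of the Lambda side of this flow is below. The key scheme (`product:<id>`), the table key name (`id`), and the dict-based `cache` are assumptions for illustration; in production the cache would be a redis-py client created outside the handler, and the delete would be `cache.delete(cache_key)`.

```python
def make_invalidation_handler(cache):
    """Build a Lambda-style handler that invalidates cache keys from
    DynamoDB Stream records."""
    def handler(event, context=None):
        invalidated = []
        for record in event.get("Records", []):
            # Only updates and deletes make cached copies stale
            if record.get("eventName") in ("MODIFY", "REMOVE"):
                # DynamoDB Stream records carry the table keys here
                item_id = record["dynamodb"]["Keys"]["id"]["S"]
                cache_key = f"product:{item_id}"
                cache.pop(cache_key, None)  # redis-py: cache.delete(cache_key)
                invalidated.append(cache_key)
        return {"invalidated": invalidated}
    return handler
```

`INSERT` events are deliberately skipped: a brand-new item has no stale cached copy to remove, and lazy loading will populate it on first read.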
Tag-Based Invalidation
Group related cache entries with tags for bulk invalidation:
Cache entry: product:123 → tags: ["catalog", "category:electronics"]
Cache entry: product:456 → tags: ["catalog", "category:electronics"]
Invalidate: all entries tagged "category:electronics"
→ Deletes product:123 and product:456 simultaneously
Implement with Redis Sets: maintain a set per tag containing all keys associated with that tag.
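The set-per-tag scheme can be sketched with Python sets in place of Redis Sets — in Redis this would be `SADD tag:<name> <key>` on write, then `SMEMBERS` plus `DEL` to invalidate the group. Class and method names are illustrative.

```python
class TaggedCache:
    """Tag-based invalidation sketch: one set per tag tracks its keys."""
    def __init__(self):
        self.data = {}   # cache key -> value
        self.tags = {}   # tag -> set of cache keys carrying that tag

    def set(self, key, value, tags=()):
        self.data[key] = value
        for tag in tags:
            self.tags.setdefault(tag, set()).add(key)  # Redis: SADD

    def invalidate_tag(self, tag):
        for key in self.tags.pop(tag, set()):  # Redis: SMEMBERS, then DEL keys
            self.data.pop(key, None)
        # Invalidated keys may linger in other tag sets; clean them up
        # lazily or on delete if memory matters.
```
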
ElastiCache Serverless
ElastiCache Serverless removes capacity planning entirely:
- Automatically scales memory and compute based on usage
- No node selection, no cluster management
- Pay for data stored (per GB-hour) and compute (per ECPU)
- Minimum charge applies ($0.125/hour ≈ $90/month)
When to use Serverless:
- Unpredictable or spiky traffic patterns
- New applications where cache sizing is unknown
- Teams that want to avoid capacity planning
When to use provisioned nodes:
- Predictable workloads where node sizing is known
- Cost optimization with Reserved Nodes (up to 55% savings)
- Requirements for specific node types or cluster configurations
Monitoring
Key CloudWatch Metrics
| Metric | Target | If Outside Target |
|---|---|---|
| CacheHitRate | > 80% | Low hit rate = wrong caching strategy or TTL |
| EngineCPUUtilization | < 70% | Scale up or add shards |
| DatabaseMemoryUsagePercentage | < 80% | Scale up or review eviction policy |
| CurrConnections | Below max | Connection pooling issue if near limit |
| ReplicationLag | < 1 second | Network or replica capacity issue |
| Evictions | Near zero | Memory pressure if evictions increase |
Set CloudWatch alarms for:
- EngineCPUUtilization > 70% — Scale before performance degrades
- DatabaseMemoryUsagePercentage > 80% — Scale before evictions begin
- CacheHitRate < 50% — Investigate caching strategy
Cost Optimization
Right-Sizing
Monitor DatabaseMemoryUsagePercentage over 2 weeks. If consistently below 50%, you are paying for unused memory. Downsize to a smaller node type.
Reserved Nodes
For steady-state production caches, Reserved Nodes provide significant savings:
| Payment Option | 1-Year Savings | 3-Year Savings |
|---|---|---|
| No upfront | ~28% | ~41% |
| Partial upfront | ~35% | ~50% |
| All upfront | ~38% | ~55% |
Data Tiering
ElastiCache data tiering automatically moves less-frequently accessed data to SSD storage, reducing memory costs for large datasets:
- Hot data stays in memory (sub-millisecond latency)
- Warm data moves to SSD (single-digit millisecond latency)
- Available on r6gd and r7gd node types
Common Mistakes
Mistake 1: Caching Without TTL
Cached data without a TTL lives forever — becoming stale as the source database changes. Always set a TTL. If you are unsure, start with 5 minutes and adjust based on your data’s change frequency and tolerance for staleness.
Mistake 2: No Connection Pooling
Creating a new Redis connection for every request is expensive. Use connection pooling in your application. For Lambda, initialize the Redis connection outside the handler function to reuse connections across invocations.
Mistake 3: Using Redis as Primary Storage
Redis is a cache, not a database. If your application cannot function when Redis is empty (cold start, failover, eviction), you have a cache dependency, not a caching strategy. Every cached item must be retrievable from the primary data store.
Mistake 4: Caching Too Much
Not all data benefits from caching. Data accessed once (unique search results, one-time API calls) wastes cache memory. Focus caching on frequently accessed, expensive-to-compute, or slowly changing data.
Getting Started
ElastiCache Redis fills the performance gap between your application and your database. For read-heavy serverless applications, high-traffic APIs, and latency-sensitive workloads, a well-implemented caching layer is often the single largest performance improvement available.
For caching architecture design, ElastiCache configuration, and performance optimization as part of our architecture review or managed services, talk to our team.


