Caching Strategies - Fundamentals of Performance Optimization

14 min read | 2024.12.26

What is Caching

Caching is a mechanism that temporarily stores copies of data in a location that can be accessed quickly. It reduces access to the original data source (database, API, etc.) and shortens response times.

Impact of Caching: If a database query takes 100ms, retrieval from cache can be completed in under 1ms.
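Before the Redis-backed examples below, the mechanism can be sketched with a plain in-memory map. `SimpleCache` is a hypothetical helper for illustration, not a specific library:

```javascript
// Minimal in-memory cache with TTL: a cache hit skips the slow data
// source entirely; an expired or missing entry is a cache miss.
class SimpleCache {
  constructor() {
    this.store = new Map();
  }

  set(key, value, ttlMs) {
    this.store.set(key, { value, expiresAt: Date.now() + ttlMs });
  }

  get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;          // cache miss
    if (Date.now() > entry.expiresAt) {    // entry expired
      this.store.delete(key);
      return undefined;
    }
    return entry.value;                    // cache hit
  }
}
```

Real systems usually add eviction (LRU, size limits) on top of this, which is what libraries and external stores like Redis provide.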

Cache Layers

flowchart TB
    Browser["Browser Cache"] --> CDN["CDN Cache"]
    CDN --> App["Application Cache (Redis, etc.)"]
    App --> DBCache["Database Cache"]
    DBCache --> DB["Database"]
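The browser and CDN layers at the top of this diagram are controlled by HTTP headers rather than application code. A sketch of one possible per-content-type policy (the header values are illustrative, not a recommendation for every site):

```javascript
// Maps a content category to a Cache-Control header value.
// 'static':  cacheable by browser and CDN for a year (fingerprinted assets)
// 'api':     browser may revalidate, shared caches must not store it
// 'shared':  CDN caches for 60s (s-maxage), browser always revalidates
function cacheControlFor(kind) {
  switch (kind) {
    case 'static': return 'public, max-age=31536000, immutable';
    case 'api':    return 'private, max-age=0, must-revalidate';
    case 'shared': return 'public, s-maxage=60, max-age=0';
    default:       return 'no-store';
  }
}
```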

Caching Patterns

Cache-Aside

The application directly manages the cache and database.

async function getUser(userId) {
  // 1. Check cache
  const cached = await cache.get(`user:${userId}`);
  if (cached) {
    return JSON.parse(cached);
  }

  // 2. Cache miss: Get from DB
  const user = await db.users.findById(userId);

  // 3. Save to cache (skip if not found, so a missing user
  //    is not cached as null for the full TTL)
  if (user) {
    await cache.setex(`user:${userId}`, 3600, JSON.stringify(user));
  }

  return user;
}

Pros: Simple, resilient to cache failures
Cons: Extra latency on cache misses

Read-Through

The cache itself handles data retrieval.

// Cache library configuration
const cache = new Cache({
  loader: async (key) => {
    // Called automatically on cache miss
    const userId = key.replace('user:', '');
    return await db.users.findById(userId);
  }
});

// Usage (Simple!)
const user = await cache.get(`user:${userId}`);

Write-Through

Updates cache and DB simultaneously on write.

async function updateUser(userId, data) {
  // Update DB
  const user = await db.users.update(userId, data);

  // Also update cache
  await cache.setex(`user:${userId}`, 3600, JSON.stringify(user));

  return user;
}

Pros: High data consistency
Cons: Increased write latency

Write-Behind

Writes to cache immediately, DB update is done asynchronously.

async function updateUser(userId, data) {
  // Update cache immediately
  await cache.setex(`user:${userId}`, 3600, JSON.stringify(data));

  // Add DB write to queue
  await writeQueue.add({ userId, data });

  return data;
}

// Background worker
writeQueue.process(async (job) => {
  await db.users.update(job.userId, job.data);
});

Pros: Fast writes
Cons: Risk of data loss if the cache fails before queued writes reach the DB

Cache Invalidation

TTL (Time To Live)

Automatically expires after a certain time.

// Expires after 60 seconds
await cache.setex('key', 60, 'value');

Event-Based Invalidation

Explicitly delete cache when data is updated.

async function updateUser(userId, data) {
  await db.users.update(userId, data);

  // Invalidate related caches
  await cache.del(`user:${userId}`);
  await cache.del(`user:${userId}:profile`);
  await cache.del(`users:list`);
}

Pattern-Based Invalidation

// Delete all user-related caches
// Note: KEYS blocks Redis while it scans the whole keyspace;
// fine for small datasets, prefer SCAN in production
const keys = await cache.keys('user:123:*');
if (keys.length > 0) {
  await cache.del(...keys); // DEL requires at least one key
}
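Because KEYS blocks the Redis server while it walks the entire keyspace, an incremental SCAN loop is the usual production alternative. A sketch assuming an ioredis-style `scan(cursor, 'MATCH', pattern, 'COUNT', n)` signature:

```javascript
// Iterates the keyspace in batches and deletes matches without
// blocking the server the way a single KEYS call would.
async function deleteByPattern(client, pattern) {
  let cursor = '0';
  do {
    // SCAN returns [nextCursor, batchOfKeys]
    const [next, keys] = await client.scan(
      cursor, 'MATCH', pattern, 'COUNT', 100
    );
    if (keys.length > 0) {
      await client.del(...keys);
    }
    cursor = next;
  } while (cursor !== '0'); // cursor '0' means the scan is complete
}
```

SCAN may return the same key more than once during a scan, but since the operation here is deletion, duplicate deletes are harmless.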

Cache Problems and Solutions

Cache Stampede (Thundering Herd)

A problem where many requests miss the cache at the same time (e.g. right after a popular key expires) and all hit the backing data source at once.

// Solution: Use locks
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function getWithLock(key, loader) {
  const cached = await cache.get(key);
  if (cached) return JSON.parse(cached);

  // Acquire lock
  const lockKey = `lock:${key}`;
  const locked = await cache.set(lockKey, '1', 'NX', 'EX', 10);

  if (!locked) {
    // Another process is loading → Wait and retry
    await sleep(100);
    return getWithLock(key, loader);
  }

  try {
    const data = await loader();
    await cache.setex(key, 3600, JSON.stringify(data));
    return data;
  } finally {
    await cache.del(lockKey);
  }
}

Probabilistic Early Recomputation

Probabilistically update cache before TTL expires.

async function getWithProbabilisticRefresh(key, loader, ttl) {
  const data = await cache.get(key);
  const remainingTtl = await cache.ttl(key);

  // If TTL is running low, probabilistically recompute
  if (data && remainingTtl < ttl * 0.1) {
    if (Math.random() < 0.1) {
      // 10% chance of background update
      loader().then(newData => {
        cache.setex(key, ttl, JSON.stringify(newData));
      }).catch(() => {
        // Refresh failed: keep serving the still-valid cached data
      });
    }
  }

  if (data) return JSON.parse(data);

  const newData = await loader();
  await cache.setex(key, ttl, JSON.stringify(newData));
  return newData;
}

Cache Key Design

// Good key design
const key = `user:${userId}:profile:v2`;

// Components:
// - Prefix: Entity type
// - Identifier: Unique ID
// - Sub-resource: Specific data
// - Version: Compatibility for schema changes
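One way to keep this convention from drifting across call sites is a small key-builder helper (`cacheKey` is a hypothetical name, not part of any cache library):

```javascript
// Builds keys of the form prefix:id[:sub]:version, dropping
// missing parts so every caller produces the same format.
function cacheKey(entity, id, sub, version = 'v1') {
  return [entity, id, sub, version].filter(Boolean).join(':');
}
```

With a single builder, bumping the version after a schema change is a one-line edit instead of a search across the codebase.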

TTL Design Guidelines

Data Type       | TTL               | Reason
----------------|-------------------|-------------------------
Static content  | 1 day - 1 week    | Rarely changes
User profile    | 1 - 24 hours      | Low change frequency
Configuration   | 5 - 30 minutes    | Moderately updated
Real-time data  | 1 - 5 minutes     | Frequently changes
Session         | 30 min - 24 hours | Balance security and UX
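These guidelines can be centralized as named constants so TTLs are not scattered through the code as magic numbers. The values below pick one point inside each range from the table and should be tuned per workload:

```javascript
// Central TTL policy, in seconds (values are illustrative midpoints)
const TTL_SECONDS = {
  staticContent: 60 * 60 * 24, // 1 day
  userProfile:   60 * 60 * 6,  // 6 hours
  configuration: 60 * 15,      // 15 minutes
  realtimeData:  60 * 2,       // 2 minutes
  session:       60 * 60 * 2,  // 2 hours
};
```

Call sites then read `cache.setex(key, TTL_SECONDS.userProfile, value)`, and changing a policy is a single edit.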

Summary

Caching is a fundamental technique for performance optimization. By understanding patterns like Cache-Aside and Write-Through, and designing appropriate TTL and invalidation strategies, you can build fast and scalable systems. Consider the balance between caching complexity and benefits when implementing.
