Saturday, August 30, 2025

Cache in front of a slow database?

Should You Front a Slow Database with a Cache?

Most of us have been there: a slow database query is dragging down response times, dashboards are red, and someone says, “Let’s put Redis in front of it.”

I've done it myself for an advertising system that needed response times under 30 ms, and it worked very well.

It’s a tried-and-true trick. Caching can take a query that costs hundreds of milliseconds and make it return in single-digit milliseconds. It reduces load on your database and makes your system feel “snappy.” But caching isn’t free — it introduces its own problems that engineers need to be very deliberate about.




Good Use Cases for Caching

  • Read-heavy workloads
    When the same data is read far more often than it’s written. For example, product catalogs, user profiles, or static metadata.

  • Expensive computations
    Search queries, aggregated analytics, or personalized recommendations where computing results on the fly is costly.

  • Burst traffic
    Handling sudden spikes (sales events, sports highlights, viral posts) where the database alone cannot keep up.

  • Low-latency requirements
    Some systems have hard latency budgets: the client needs a response within, say, 50 ms, or it aborts the request.


The Catch: Cache Consistency

The hardest part of caching isn’t adding Redis or Memcached — it’s keeping the cache in sync with the database.

Here are the main consistency issues you’ll face:

  1. Stale Data
    If the cache isn’t updated when the database changes, users may see outdated results.
    Example: A user updates their shipping address, but the checkout flow still shows the old one because it’s cached.

  2. Cache Invalidation
    The classic hard problem: When do you expire cache entries? Too soon → database load spikes. Too late → users see stale values.

  3. Race Conditions
    Writes may hit the database while another process is still serving old cache data. Without careful ordering, a reader can repopulate the cache with a value it fetched before the write, silently undoing the invalidation (see the sketch below).
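
To make point 3 concrete, here's a minimal single-process replay of that race in Python. The two dicts stand in for the database and the cache; the interleaving is spelled out step by step.

    # Two dicts stand in for the database and the cache.
    db = {"user:1:address": "old address"}
    cache = {}

    # Reader, step 1: cache miss, so it reads the (still old) value from the DB.
    reader_saw = db["user:1:address"]

    # Writer: updates the DB, then invalidates the cache entry.
    db["user:1:address"] = "new address"
    cache.pop("user:1:address", None)

    # Reader, step 2: populates the cache with the value it read *before* the
    # write, silently undoing the writer's invalidation.
    cache["user:1:address"] = reader_saw

    print(cache["user:1:address"])  # "old address" -- stale until the next
                                    # invalidation (or a TTL) rescues it

This is one reason a TTL is worth keeping as a backstop even when you invalidate on every write.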


Common Strategies

  • Cache Aside (Lazy Loading)
    Application checks cache → if miss, fetch from DB → populate cache.
    ✅ Simple, common.
    ❌ Risk of stale data unless you also invalidate on updates (first sketch after this list).

  • Write-Through
    Writes always go through the cache → cache updates DB.
    ✅ Better consistency.
    ❌ Higher write latency, more complexity (contrasted with write-behind in the second sketch after this list).

  • Write-Behind
    Writes update the cache, and DB updates happen asynchronously.
    ✅ Fast writes.
    ❌ Risk of data loss if cache fails before DB is updated.

  • Time-to-Live (TTL)
    Expire cache entries after a set period.
    ✅ Easy safety net.
    ❌ Not precise; stale reads possible until expiry.
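
To make the first and last strategies concrete, here's a minimal sketch of cache-aside with a TTL safety net, assuming a local Redis reachable through redis-py; the _db dict and the user:{id} key scheme are stand-ins for illustration.

    import json
    import redis  # pip install redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)
    CACHE_TTL_SECONDS = 60  # safety net: a stale entry lives at most this long

    _db = {1: {"id": 1, "name": "Ada"}}  # stand-in for the real, slow database

    def get_user(user_id):
        """Cache-aside read: check the cache first, fall back to the DB."""
        key = f"user:{user_id}"
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)                       # cache hit
        user = _db[user_id]                                 # cache miss: query the DB
        r.set(key, json.dumps(user), ex=CACHE_TTL_SECONDS)  # repopulate with TTL
        return user

    def update_user(user_id, fields):
        """DB write first, then invalidate so the next read repopulates."""
        _db[user_id].update(fields)
        r.delete(f"user:{user_id}")

Invalidating on update, rather than writing the new value into the cache, keeps the write path simple, and the TTL bounds how long the race from the previous section can leave an entry stale.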
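
And a toy contrast of write-through versus write-behind under the same assumptions (local Redis, a dict standing in for the database). The queue makes the trade-off visible: write_behind returns as soon as the cache is updated, and whatever is still queued is exactly what you'd lose in a crash.

    import json
    import queue
    import threading
    import redis  # pip install redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)
    _db = {}                  # stand-in for the real database
    _pending = queue.Queue()  # write-behind: DB writes awaiting flush

    def write_through(key, value):
        """Cache and DB updated in the same call: slower writes, but the
        two stores never drift apart for long."""
        r.set(key, json.dumps(value))
        _db[key] = value  # synchronous DB write

    def write_behind(key, value):
        """Returns after the cache write; the DB write happens later."""
        r.set(key, json.dumps(value))
        _pending.put((key, value))  # flushed asynchronously below

    def _flusher():
        while True:
            key, value = _pending.get()
            _db[key] = value  # deferred DB write; lost if we crash first
            _pending.task_done()

    threading.Thread(target=_flusher, daemon=True).start()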


So, Is It Worth It?

If your workload is read-heavy, latency-sensitive, and relatively tolerant of eventual consistency, caching is usually a big win.

But if your workload is write-heavy or requires strict consistency (think payments, inventory, or medical records), caching can create more problems than it solves.

The lesson: don’t add Redis or Memcached just because they’re shiny tools. Add them because you’ve carefully measured your system, know where the bottleneck is, and can live with the consistency trade-offs.


Takeaway:
Caching is like nitrous oxide for your system — it can make things blazing fast, but you need to handle it with care or you’ll blow the engine.
