Caching Pitfalls Every Developer Should Know


Animation tools: Adobe Illustrator and After Effects.

Check out our bestselling System Design Interview books:

ABOUT US:
Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.
Comments

Cache invalidation is the elephant in the room

IceQub

1. Cache Stampede 🐘: many concurrent requests miss the cache for the same piece of data and all hit the DB at once
a. lock: after a cache miss, only one request fetches the data directly from the DB while the others wait (see the sketch after this list)
b. external process: proactively or reactively refreshes expiring/expired data in the cache
c. probabilistic early expiration: each request may trigger an early refresh of the data in the cache before it expires
2. Cache Penetration 🏹: the requested data exists in neither the cache nor the DB
a. cache a placeholder for non-existent data
b. bloom filter (it can only tell you that data definitely does not exist; a positive answer means the data merely might exist)
3. Cache Avalanche 🏔: a large amount of data is missing from the cache at the same time (mass expiry or a cache outage)
a. circuit breakers on both the cache and the DB
b. a cache cluster instead of a single cache node, so when one part is down the other parts remain online
c. cache pre-warming: when starting the app, fill the cache before the service begins taking traffic
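
These mitigations translate directly into code. Below is a minimal Python sketch, assuming an in-process dictionary cache and a hypothetical db.fetch_user helper (neither comes from the video), showing 1a (a per-key lock so only one request rebuilds a missing entry) and 2a (caching a placeholder for non-existent data):

```python
import json
import threading

# Minimal in-process sketch; in production the lock and the cache would live
# in a shared store (e.g. Redis SET NX) so every app instance sees them.
CACHE: dict[str, str] = {}           # stand-in for a real cache client
MISSING = "__MISSING__"              # placeholder for non-existent rows (2a)

_locks: dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()

def _lock_for(key: str) -> threading.Lock:
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())

def get_user(user_id: str, db):
    key = f"user:{user_id}"
    cached = CACHE.get(key)
    if cached == MISSING:            # negative-cache hit: stops penetration
        return None
    if cached is not None:
        return json.loads(cached)

    # 1a: only one request per key rebuilds the entry; the others wait on the lock.
    with _lock_for(key):
        cached = CACHE.get(key)      # re-check after acquiring the lock
        if cached is not None:
            return None if cached == MISSING else json.loads(cached)
        row = db.fetch_user(user_id)  # hypothetical DB helper
        CACHE[key] = MISSING if row is None else json.dumps(row)
        return row
```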

yuxueyuan

All I know is there are 2 hard things about programming:
- naming things
- cache invalidation
- off-by-one errors

dave

I really appreciate your videos. They are so high quality, with the best explanations.
Could you please make a video on the best strategy for using a relational database such as Microsoft SQL Server (or similar) together with Elasticsearch?
How do you keep them synchronized? Thank you in advance.

Kingside

Caches work well in tandem with claims; that is, some data should never be cached but rather claimed by a user during the editing process, with the ability for another user to revoke the claim. Saves always check claim status first.
Multi-level cache invalidation is important. For example, products returned for a catalog can use a product cache with a polled cache-invalidation process. Yet for display on a product detail page, a quick check of a timestamp is done, and that timestamp is reset by a trigger whenever any of the various product tables are updated. Thus a quick scalar DB query ensures the cache is valid, which also helps protect the cache itself from becoming stale.
This can be taken even further with a dedicated timestamp table for complex product information. This approach dramatically improves performance, always guaranteeing valid data on a detail page, while keeping catalogs valid within the timespan of the cache manager's polling of a cache-invalidation table for changes.
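
A minimal sketch of that timestamp check, assuming a trigger-maintained product_timestamps table and generic cache/db helpers (all names here are illustrative, not taken from the comment):

```python
def get_product_for_detail_page(product_id: int, cache, db):
    """Validate a cached product against a trigger-maintained timestamp table."""
    # Quick scalar query: the trigger bumps this value on any product-table update.
    last_modified = db.scalar(
        "SELECT last_modified FROM product_timestamps WHERE product_id = %s",
        (product_id,),
    )

    entry = cache.get(f"product:{product_id}")   # {"seen": timestamp, "value": product}
    if entry is not None and entry["seen"] == last_modified:
        return entry["value"]                    # timestamp unchanged, cache is provably fresh

    product = db.fetch_product(product_id)       # hypothetical loader for the full product
    cache.set(f"product:{product_id}", {"seen": last_modified, "value": product})
    return product
```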

michaelkhalsa

Stampede vs. Avalanche: awesome explanation. Kindly use a real-world example to correlate with these caching issues; that would help tremendously.

gorakh
Автор

The most important thing to know about caching is: don't cache unless it's absolutely necessary.
Also, a database in conjunction with read replicas can be much more resilient and performant than your homebrew crappy caching mechanism.
Caching is a last resort, after you've tried everything you can with the database, for example tuning queries, adding indexes, etc.

ANONAAAAAAAAA

Add jitter to the TTL to reduce cache avalanches and many related issues.
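
For example, a small wrapper can randomize each key's TTL. This sketch assumes a Redis-style client with a setex command; the ±20% figure is just an illustration:

```python
import json
import random

def cache_with_jitter(cache, key, value, base_ttl=300, jitter_ratio=0.2):
    """Randomize each key's TTL by +/- 20% so entries don't all expire together."""
    jitter = random.uniform(-jitter_ratio, jitter_ratio) * base_ttl
    cache.setex(key, int(base_ttl + jitter), json.dumps(value))
```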

rogers.

I'm surprised that there is no mention of an easy solution (albeit there might still be an issue when starting from a cold cache) for an avalanche/stampede: just use different caching times. That should somewhat reduce the load when the database would otherwise be hit with many requests at once. But in essence, only cache what is necessary.

BigHalfSteps

Thanks for the insights on cache management. Could you please suggest whether keys in the cache need encryption? And what should the key flush time be?

mailbrn

What if we add some kind of jitter to the key TTL, so we minimize the probability of the keys expiring at the same time?

vladyslavlen

At 0:27, on the left diagram, shouldn't the process order be:
1. Data request
2. Data response (no cached data)
3. Read original
4. Copy cache

simonbernard

Would love to see you tackle cache consistency too: what happens when the database write succeeds but the cache write fails? Or when the database is written concurrently with two different values, and the last write to the database was value A while the last write to the cache was value B? Now the cache is forever inconsistent.
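
One common (though not bulletproof) mitigation for that failure mode is to invalidate the cache on writes rather than update it, and to keep a TTL on every entry as a backstop so a failed invalidation eventually heals. A minimal sketch, assuming generic db and cache clients (all names are illustrative):

```python
def update_user_email(user_id: str, email: str, db, cache) -> None:
    # Write the source of truth first.
    db.execute("UPDATE users SET email = %s WHERE id = %s", (email, user_id))

    # Invalidate instead of updating the cached value: concurrent writers can
    # no longer race to leave a wrong value behind, only a missing one.
    try:
        cache.delete(f"user:{user_id}")
    except Exception:
        # If the delete fails, the entry stays stale until its TTL expires,
        # which is why every cached entry should carry a TTL backstop.
        pass
```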

toxicitysocks

At 3:58, the "find key" arrow has a typo; it should be (3) instead of (4).

devid

How can there be fewer than 1M subscribers to your channel? You have the best explanations.

micahpezdirtz

Using a bloom filter sounds easy, but it can't delete elements, right? Would the bloom filter need to be rebuilt periodically so it stops reporting deleted items?
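
A standard bloom filter indeed has no delete operation, so the usual options are to rebuild it periodically from the source of truth or to use a counting bloom filter, which keeps small counters instead of single bits. A minimal counting-filter sketch (the sizes and hash scheme are illustrative, not from the video):

```python
import hashlib

class CountingBloomFilter:
    """Bloom filter variant that supports deletion by keeping per-slot counters."""

    def __init__(self, size: int = 10_000, num_hashes: int = 4):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _indexes(self, item: str):
        # Derive num_hashes slot indexes from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def remove(self, item: str) -> None:
        # Only call for items that were previously added, or counters can underflow.
        for idx in self._indexes(item):
            if self.counters[idx] > 0:
                self.counters[idx] -= 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.counters[idx] > 0 for idx in self._indexes(item))
```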

eduardokuroda

At 2:33 there's a hidden smiling gem at the bottom right.

NemiroIlia

If something creates traffic, add a traffic signal, i.e. a lock, to regulate it.
If something creates traffic, add multiple systems (web server, cache server, etc.) to handle it.
If something can fail, keep redundant backups of THAT.

This applies to anything.

Also, to know in advance whether a lookup would fail even in the DB because the relevant answer isn't there, note that down beforehand in some way.

parthi

Can't we use request collapsing to prevent a stampede? Since it's mainly caused by an expired cache entry while multiple requests are trying to access the same resource.
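
Request collapsing (often called singleflight) is essentially the per-key-lock idea from the mitigation list above: concurrent callers for the same key all await a single in-flight fetch. A minimal async Python sketch, with all names being illustrative:

```python
import asyncio

class SingleFlight:
    """Collapse concurrent requests for the same key into one fetch."""

    def __init__(self):
        self._inflight: dict[str, asyncio.Task] = {}

    async def do(self, key: str, fetch):
        task = self._inflight.get(key)
        if task is None:
            # First caller for this key starts the fetch; later callers reuse it.
            task = asyncio.ensure_future(fetch())
            self._inflight[key] = task
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        return await task

# Usage: every concurrent caller awaits the same DB fetch, e.g.
# result = await flight.do("user:42", lambda: db.fetch_user(42))
```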

PranaySoniKumar

Question:

What's the point of a cache server? Why isn't the server itself / web server doing the caching?

If caching is supposed to be for fast retrieval, and we store the data on a different server, won't the network call take more time than querying the DB in the first place?

kazama