cache-strategy
/cache-strategy
Implement a permanent cache-first strategy: read from cache, fall back to DB on miss, write to cache, invalidate only on data mutation. No expiry timers — the cache stays valid until the underlying data changes.
Philosophy
The goal is one DB read per data item, ever (until it changes):
READ:  cache hit?  → return cached value
       cache miss? → query DB → write to cache → return value
WRITE: update DB → invalidate (or update) cache entry → done
This is the Play Framework model: no TTL, no polling, no background refresh. The cache is always warm and always correct because it is invalidated precisely when the source changes.
Why permanent over TTL?
- TTL caches trade stale reads for simplicity — every expiry is a guaranteed DB hit even if nothing changed
- Permanent caches with invalidation give you zero unnecessary DB hits and zero stale reads
- TTL is only appropriate when you can't know when data changes (third-party APIs, etc.)
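The read and write flow above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the `PermanentCache` class, its `db_reads` counter, and the dict standing in for the database are all assumptions made for the example.

```python
class PermanentCache:
    """Cache-first reads with invalidation on write; entries never expire."""

    def __init__(self, load_from_db, write_to_db):
        self._store = {}              # stand-in for Redis or Memcached
        self._load = load_from_db     # called only on a cache miss
        self._write = write_to_db
        self.db_reads = 0             # counts DB fallbacks, for illustration

    def read(self, key):
        if key in self._store:        # cache hit
            return self._store[key]
        self.db_reads += 1            # cache miss: query the DB once
        value = self._load(key)
        self._store[key] = value      # no TTL: stays until invalidated
        return value

    def write(self, key, value):
        self._write(key, value)       # update the DB first
        self._store.pop(key, None)    # then invalidate the cached entry


# Usage with a plain dict as a hypothetical database
fake_db = {"user:1": "Ada"}
cache = PermanentCache(fake_db.get, fake_db.__setitem__)

cache.read("user:1")                  # miss: first DB read
cache.read("user:1")                  # served from cache
cache.write("user:1", "Grace")        # invalidates user:1
latest = cache.read("user:1")         # one more DB read after the change
```

Three reads cost two DB queries here: one for the cold miss and one after the invalidation.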
Step 1: Detect cache infrastructure
bash
# Check for Redis
grep -rnE "redis|Redis|REDIS_URL" --include="*.json" --include=".env*" --include="*.yml" --include="*.yaml" --include="*.rb" --include="*.py" --include="*.js" --include="*.ts" . 2>/dev/null | grep -v node_modules | head -10
# Check for Memcached
grep -rnE "memcached|Memcached|MEMCACHE" --include="*.json" --include=".env*" --include="*.yml" . 2>/dev/null | grep -v node_modules | head -5
# Check for in-process cache (node-cache, lru-cache, etc.)
grep -rnE "lru-cache|node-cache|memory-cache|caffeine|guava.cache" --include="*.json" --include="*.js" --include="*.ts" --include="*.java" . 2>/dev/null | grep -v node_modules | head -5
Report what's available. If no cache layer exists, recommend Redis as the default and offer to add the client library.
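Where a shell is not convenient, the same detection can be approximated by reading a few manifest files. The sketch below is an assumption-laden illustration: the `MANIFESTS` list, the `CACHE_MARKERS` table, and both function names are hypothetical, and substring matching only suggests that a library is present.

```python
from pathlib import Path

# Hypothetical mapping from a cache layer to substrings that suggest it is in use
CACHE_MARKERS = {
    "redis": ("redis", "REDIS_URL"),
    "memcached": ("memcached", "MEMCACHE"),
    "in-process": ("lru-cache", "node-cache", "memory-cache", "caffeine"),
}
# Assumed set of files worth checking; extend it for the project at hand
MANIFESTS = ("package.json", "requirements.txt", "Gemfile", "build.sbt", ".env")


def detect_cache_layers(project_root):
    """Return the sorted names of cache layers mentioned in known manifest files."""
    found = set()
    for name in MANIFESTS:
        path = Path(project_root) / name
        if not path.is_file():
            continue
        text = path.read_text(errors="ignore").lower()
        for layer, markers in CACHE_MARKERS.items():
            if any(marker.lower() in text for marker in markers):
                found.add(layer)
    return sorted(found)


def report(layers):
    """Summarize the findings, falling back to the Redis recommendation."""
    if layers:
        return "Available cache layers: " + ", ".join(layers)
    return "No cache layer found; recommend Redis as the default"
```

Calling `report(detect_cache_layers("."))` from the project root produces the one-line summary this step asks for.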
Step 2: Identify cacheable DB reads
Scan for DB query patterns and classify by cacheability:
High-value cache candidates (read often, change rarely):
- User profiles, settings, preferences
- Product/item details
- Configuration tables
- Reference data (categories, tags, enums stored in DB)
- Aggregates that are expensive to compute (counts, sums)
Poor cache candidates (skip these):
- Real-time data (live inventory counts, chat messages)
- Per-session data
- Data that changes on every write (e.g., counters that increment on every request)
bash
# Find the most-called query patterns
grep -rnE "\.find\b|\.findById|\.findOne|\.where|SELECT" \
  --include="*.rb" --include="*.py" --include="*.js" --include="*.ts" \
  . 2>/dev/null | grep -v node_modules | grep -v test | grep -v spec | head -30
For each candidate, note:
- Cache key pattern (e.g., `user:{id}`, `product:{slug}`)
- Where it's written/updated (to know where to place invalidation)
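One way to keep these notes is a small inventory that pairs each key pattern with its read site and its write sites. The `CacheCandidate` class is a hypothetical helper, and the service and method names are placeholders, not names taken from a real codebase.

```python
from dataclasses import dataclass, field


@dataclass
class CacheCandidate:
    key_pattern: str                  # e.g. "user:{id}"
    read_site: str                    # where the DB read happens
    write_sites: list = field(default_factory=list)  # where invalidation must go

    def key(self, **parts):
        """Fill the pattern, e.g. key(id=123) gives 'user:123'."""
        return self.key_pattern.format(**parts)


# Hypothetical entries; the service and method names are placeholders
candidates = [
    CacheCandidate("user:{id}", "UserService.getUser",
                   ["UserService.updateUser", "UserService.deleteUser"]),
    CacheCandidate("product:{slug}", "ProductService.findBySlug",
                   ["ProductService.updateProduct"]),
]

# A candidate with no known write site cannot be invalidated reliably
unsafe = [c.key_pattern for c in candidates if not c.write_sites]
```

Any pattern that ends up in `unsafe` is worth a second look before it gets a permanent cache entry, since nothing would ever invalidate it.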
Step 3: Implement cache-first reads
For each cacheable read, apply this pattern:
Redis (Node.js / TypeScript)
typescript
async function getUser(id: string) {
  const cacheKey = `user:${id}`;
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached); // cache hit
  const user = await db.users.findById(id); // cache miss: fall back to the DB
  // Skip caching a missing user so a later insert is not hidden by a cached null
  if (user) await redis.set(cacheKey, JSON.stringify(user)); // no TTL: permanent until invalidated
  return user;
}
Redis (Ruby / Rails)
ruby
def find_user(id)
  Rails.cache.fetch("user:#{id}") do
    User.find(id) # only called on cache miss
  end
end
Redis (Python)
python
import json

def get_user(user_id: int):
    cache_key = f"user:{user_id}"
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)  # cache hit
    user = db.session.get(User, user_id)  # cache miss: fall back to the DB
    if user is None:
        return None  # do not cache a missing row
    data = user.to_dict()
    redis_client.set(cache_key, json.dumps(data))  # no expiry
    return data  # same shape as the cache-hit path
Play Framework (Scala)
scala
def getUser(id: Long): Future[User] = {
  cache.getOrElseUpdate(s"user:$id") {
    userRepository.findById(id)
  }
}
Step 4: Implement cache invalidation on writes
For every place that mutates cached data, add invalidation immediately after the DB write succeeds:
Node.js / TypeScript
typescript
async function updateUser(id: string, data: Partial<User>) {
  const user = await db.users.update(id, data);
  await redis.del(`user:${id}`); // invalidate on change
  return user;
}
Ruby / Rails
ruby
def update_user(id, attrs)
  user = User.find(id).tap { |u| u.update!(attrs) }
  Rails.cache.delete("user:#{id}")
  user
end
Important: invalidate on delete too:
typescript
async function deleteUser(id: string) {
  await db.users.delete(id);
  await redis.del(`user:${id}`);
}
For lists/collections, invalidate the collection key when any item in it changes:
typescript
await redis.del(`user:${id}`); // the specific item
await redis.del(`users:list`); // any cached list that included this item
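The item-plus-collection rule can be folded into one write helper so no call site forgets the list keys. This is a sketch under stated assumptions: `FakeCache` is a dict-backed stand-in that only mimics the `delete(*keys)` call shape of a Redis client, the `db` argument is a plain dict, and `list_keys` is a hypothetical parameter.

```python
class FakeCache:
    """Dict-backed stand-in exposing delete(*keys), like a Redis client."""

    def __init__(self):
        self.data = {}

    def delete(self, *keys):
        # Returns how many of the given keys were actually present
        return sum(1 for k in keys if self.data.pop(k, None) is not None)


def update_user(db, cache, user_id, attrs, list_keys=("users:list",)):
    """Write to the DB, then drop the item key and every list key that holds it."""
    db[user_id] = {**db[user_id], **attrs}   # a missing user raises KeyError before any invalidation
    cache.delete(f"user:{user_id}", *list_keys)
    return db[user_id]


db = {1: {"name": "Ada"}}
cache = FakeCache()
cache.data = {"user:1": "a", "users:list": "b", "user:2": "c"}
update_user(db, cache, 1, {"name": "Grace"})
remaining = sorted(cache.data)               # only user:2 is left
```

Because invalidation runs only after the write succeeds, a failed write leaves the cache exactly as it was.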
Step 5: Cache key conventions
Use a consistent key naming scheme across the codebase:
<entity>:<identifier> → user:123, product:slug-name
<entity>:list:<filter> → products:list:category-5
<entity>:<id>:<relation> → user:123:orders
aggregate:<entity>:<filter> → aggregate:orders:user-123-total
Document the convention in a comment at the top of your cache module:
typescript
// Cache key conventions:
// user:{id} - single user record
// users:list - paginated user list (invalidate on any user change)
// user:{id}:orders - orders belonging to a user
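Building keys through a handful of helpers keeps the naming scheme in one place, so reads and invalidations cannot drift apart. The function names below are hypothetical; only the key shapes come from the convention above.

```python
def item_key(entity, identifier):
    """<entity>:<identifier>, e.g. user:123"""
    return f"{entity}:{identifier}"


def list_key(entity, filter_name=None):
    """<entity>:list or <entity>:list:<filter>, e.g. products:list:category-5"""
    base = f"{entity}:list"
    return f"{base}:{filter_name}" if filter_name else base


def relation_key(entity, identifier, relation):
    """<entity>:<id>:<relation>, e.g. user:123:orders"""
    return f"{entity}:{identifier}:{relation}"


def aggregate_key(entity, filter_name):
    """aggregate:<entity>:<filter>, e.g. aggregate:orders:user-123-total"""
    return f"aggregate:{entity}:{filter_name}"
```

A read path and its matching invalidation would then both call `item_key("user", user_id)` rather than formatting the string by hand.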
Step 6: Cache warming (optional)
For critical data that must never have a cold-start miss, add a warm-up step on app boot:
typescript
async function warmCache() {
  const activeUsers = await db.users.findActiveUsers();
  await Promise.all(
    activeUsers.map(u => redis.set(`user:${u.id}`, JSON.stringify(u)))
  );
}
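For a large set of active users, one possible variation is to warm in batches so boot does not issue thousands of individual writes at once. The sketch assumes users are plain dicts with an `id` field, and `DictCache` is a stand-in that only mimics the `mset(mapping)` call shape found in redis-py; the `batch_size` default is arbitrary.

```python
import json


def warm_cache(cache, users, batch_size=500):
    """Write users to the cache in batches; returns the number of keys written."""
    written = 0
    for start in range(0, len(users), batch_size):
        batch = users[start:start + batch_size]
        mapping = {f"user:{u['id']}": json.dumps(u) for u in batch}
        cache.mset(mapping)          # one call per batch, no TTL
        written += len(mapping)
    return written


class DictCache:
    """Stand-in with the same mset(mapping) shape as redis-py."""

    def __init__(self):
        self.data = {}

    def mset(self, mapping):
        self.data.update(mapping)


cache = DictCache()
active_users = [{"id": i, "name": f"u{i}"} for i in range(1200)]  # hypothetical data
count = warm_cache(cache, active_users, batch_size=500)
```

Here 1200 users are written in three batches of 500, 500, and 200.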
Step 7: Summary
Report what was implemented:
Cache Strategy Applied
Cache layer: Redis (permanent, no TTL)
Reads cached (X total)
- user:{id} — UserService.getUser()
- product:{slug} — ProductService.findBySlug()
- ...
Invalidation points added (Y total)
- user:{id} — UserService.updateUser(), UserService.deleteUser()
- product:{slug} — ProductService.updateProduct()
- ...
Estimated DB load reduction
Before: ~N DB queries/min
After: ~M DB queries/min (first-read only, then cache)
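The before and after figures can be roughed out from read and invalidation rates. The sketch below is an estimate under stated assumptions, not a measurement: it assumes a warm cache, counts each invalidated key as at most one re-read, and ignores concurrent misses on the same key. All numbers are hypothetical.

```python
def estimate_db_reads_per_min(reads_per_min, invalidations_per_min):
    """Rough steady-state DB reads per minute once the cache is warm.

    Each invalidated key is re-read at most once, and no more reads can
    reach the DB than are issued in the first place.
    """
    return min(reads_per_min, invalidations_per_min)


before = 1200                       # hypothetical: every read hits the DB
after = estimate_db_reads_per_min(reads_per_min=1200, invalidations_per_min=15)
reduction = 1 - after / before      # fraction of DB reads avoided
```

With these inputs the estimate drops from 1200 to 15 DB reads per minute, a reduction of 98.75%. Count list keys separately when a single write invalidates more than one key.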
Note: for data that comes from third-party APIs or sources where you can't intercept writes, use a long TTL (hours, not minutes) as a fallback, but flag these separately so they can be revisited if an invalidation webhook becomes available.
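A minimal sketch of the TTL fallback with separate flagging follows. `TtlCache` is an in-memory stand-in, and the `TTL_FALLBACK_KEYS` registry and the `fx:usd-eur` key are hypothetical. With redis-py, the equivalent write would pass an expiry via `set(key, value, ex=seconds)`.

```python
import time

TTL_FALLBACK_KEYS = set()   # keys to revisit if an invalidation webhook appears


class TtlCache:
    """In-memory stand-in for a cache that supports per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)
        TTL_FALLBACK_KEYS.add(key)   # flag it as a TTL-based entry

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]      # expired: next read goes back to the source
            return None
        return value


now = [0.0]                                      # controllable clock for the demo
rates = TtlCache(clock=lambda: now[0])
rates.set("fx:usd-eur", "0.92", ttl_seconds=6 * 3600)   # hours, not minutes
fresh = rates.get("fx:usd-eur")                  # still cached
now[0] += 6 * 3600
stale = rates.get("fx:usd-eur")                  # expired after six hours
```

Keeping the flagged keys in one registry makes it easy to list every TTL-based entry later and switch it to invalidation.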