spice-caching
Spice Caching
Configure in-memory caching for SQL query results, search results, and embeddings in the Spice runtime.
Overview
Spice caches results from SQL queries (`/v1/sql`), search (`/v1/search`), and embeddings requests. All three caches are enabled by default with a 1-second TTL and a 128 MiB maximum size. Caching applies to the HTTP and Arrow Flight APIs.
Configuration
Caching is configured under `runtime.caching` in `spicepod.yaml`:
```yaml
version: v1
kind: Spicepod
name: app
runtime:
  caching:
    sql_results:
      enabled: true
      max_size: 1GiB # Default 128MiB
      item_ttl: 1m # Default 1s
      eviction_policy: lru # lru | tiny_lfu
      hashing_algorithm: xxh3
      cache_key_type: plan # plan | sql
      encoding: none # none | zstd
      stale_while_revalidate_ttl: 30s # Default 0s (disabled)
    search_results:
      enabled: true
      max_size: 1GiB
      item_ttl: 1m
      eviction_policy: lru
    embeddings:
      enabled: true
      max_size: 128MiB
      item_ttl: 1m
```
Common Parameters (All Cache Types)
| Parameter | Default | Description |
|---|---|---|
| `enabled` | `true` | Enable/disable the cache |
| `max_size` | `128MiB` | Maximum cache size |
| `eviction_policy` | `lru` | `lru` or `tiny_lfu` |
| `item_ttl` | `1s` | Cache entry TTL (Time to Live) |
| `hashing_algorithm` | `xxh3` | Hash for cache keys: `xxh3`, `ahash`, `xxh64`, `xxh128`, `blake3`, `siphash` |
SQL Results Extra Parameters
| Parameter | Default | Description |
|---|---|---|
| `cache_key_type` | `plan` | `plan` or `sql` |
| `encoding` | `none` | `none` or `zstd` |
| `stale_while_revalidate_ttl` | `0s` | Serve stale entries while refreshing in the background. `0s` disables it. |
Choosing Parameters
cache_key_type
- `plan` (default): Matches semantically equivalent queries even when the SQL syntax differs. Adds query parsing overhead.
- `sql`: Faster lookups, exact string match. Avoid with dynamic functions like `NOW()`.
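The exact-string behavior of `sql` keys can be illustrated with a small sketch. It is not the runtime's key derivation: `sql_cache_key` is a hypothetical helper, and `hashlib.blake2b` stands in for whichever `hashing_algorithm` is configured. It only shows why two queries that differ in letter case miss each other's cache entries under `cache_key_type: sql`.

```python
import hashlib


def sql_cache_key(sql: str) -> str:
    """Hypothetical key built from the exact query text (stand-in for `sql` mode)."""
    # blake2b is an assumption for illustration; it is not a Spice default.
    return hashlib.blake2b(sql.encode("utf-8"), digest_size=16).hexdigest()


key_a = sql_cache_key("SELECT id FROM users WHERE org_id = 1")
key_b = sql_cache_key("select id from users where org_id = 1")

# The two texts differ only in letter case, yet they produce different keys,
# so the second query would not reuse the first query's cached result.
print(key_a == key_b)
```

With `cache_key_type: plan`, the source states that semantically equivalent queries match despite such syntax differences, at the cost of parsing the query first.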
eviction_policy
- `lru` (default): Good general-purpose policy.
- `tiny_lfu`: Better hit rate when some queries are accessed much more frequently than others.
encoding
- `none` (default): Zero compression overhead, uses more memory.
- `zstd`: High compression (50-90% reduction) with fast decompression. Use for large result sets.
hashing_algorithm
- `xxh3` (default): Fastest general-purpose.
- `ahash` / `xxh64` / `xxh128`: Lower collision probability for many cached queries.
- `blake3`: Cryptographic security required.
- `siphash`: Protection against hash-flooding DoS attacks.
Stale-While-Revalidate
When `stale_while_revalidate_ttl` is set to a non-zero value:
- Cache entries are served normally until `item_ttl` expires.
- After `item_ttl` expires but before `item_ttl + stale_while_revalidate_ttl`, the stale entry is served immediately with `STALE` status.
- A background task refreshes the cache entry.
- After `item_ttl + stale_while_revalidate_ttl`, the entry is evicted.
```yaml
runtime:
  caching:
    sql_results:
      enabled: true
      item_ttl: 10s
      stale_while_revalidate_ttl: 10s
      # Fresh for 10s → Stale (served while refreshing) for 10s → Evicted
```
> **Conflict warning**: When using `refresh_mode: caching` on a dataset, do not configure both `runtime.caching.sql_results.stale_while_revalidate_ttl` and `acceleration.params.caching_stale_while_revalidate_ttl` for the same dataset. Choose one approach.
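The timeline above can be sketched as a small classifier. This is an illustration of the documented windows, not the runtime's implementation: `EntryState` and `classify_entry` are hypothetical names, and treating each boundary as exclusive on the left window is an assumption.

```python
from enum import Enum


class EntryState(Enum):
    FRESH = "fresh"      # served normally
    STALE = "stale"      # served immediately while a background refresh runs
    EVICTED = "evicted"  # no longer served from the cache


def classify_entry(age_s: float, item_ttl_s: float, swr_ttl_s: float) -> EntryState:
    """Classify a cache entry by its age in seconds."""
    if age_s < item_ttl_s:
        return EntryState.FRESH
    if age_s < item_ttl_s + swr_ttl_s:
        return EntryState.STALE
    return EntryState.EVICTED


# Same values as the YAML example: item_ttl 10s, stale_while_revalidate_ttl 10s.
for age in (3, 12, 25):
    print(age, classify_entry(age, item_ttl_s=10, swr_ttl_s=10).name)
```

With `swr_ttl_s=0` the stale window is empty, which matches the default of `0s` meaning stale-while-revalidate is disabled.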
Cache Control Headers
HTTP API
Use the standard `Cache-Control` header with `/v1/sql` and `/v1/search`:
| Directive | Description |
|---|---|
| `no-cache` | Skip cache for this request; cache the result for future requests |
| `min-fresh=N` | Require cached entry to remain fresh for at least N seconds |
| `max-stale=N` | Accept stale responses up to N seconds old |
| `only-if-cached` | Return only cached responses; error on cache miss |
| `stale-if-error=N` | Serve stale cache (up to N seconds) if fetching fresh data fails |
```bash
# Skip cache for this query
curl -H "cache-control: no-cache" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only accept fresh results (at least 30s remaining)
curl -H "cache-control: min-fresh=30" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Accept stale up to 60s
curl -H "cache-control: max-stale=60" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only return if cached
curl -H "cache-control: only-if-cached" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
```
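For clients that do not shell out to `curl`, the same request can be sketched with Python's standard library. The URL and body mirror the curl examples; `build_sql_request` is a hypothetical helper, and the request is only constructed here, not sent, so the snippet runs without a Spice runtime.

```python
import urllib.request

SPICE_SQL_URL = "http://localhost:8090/v1/sql"  # endpoint from the curl examples


def build_sql_request(sql: str, cache_control: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/sql carrying a Cache-Control directive."""
    return urllib.request.Request(
        SPICE_SQL_URL,
        data=sql.encode("utf-8"),
        headers={"Cache-Control": cache_control},
        method="POST",
    )


req = build_sql_request("SELECT 1", "max-stale=60")
# urllib stores header names capitalized as "Cache-control".
print(req.get_method(), req.full_url, req.get_header("Cache-control"))
# To send it against a running runtime: urllib.request.urlopen(req).read()
```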
Spice CLI
```bash
spice sql --cache-control no-cache
spice sql --cache-control min-fresh=30
spice sql --cache-control max-stale=60
spice sql --cache-control only-if-cached
spice search --cache-control no-cache
```
Arrow FlightSQL
Set `cache-control` in request metadata:
```rust
// Requires arrow_flight::FlightDescriptor and tonic::IntoRequest in scope.
let mut request = FlightDescriptor::new_cmd(sql_command_bytes).into_request();
// tonic metadata values are typed, so parse the string into a MetadataValue.
request.metadata_mut().insert("cache-control", "no-cache".parse().unwrap());
```
JDBC:
```java
Properties props = new Properties();
props.setProperty("cache-control", "no-cache");
Connection conn = DriverManager.getConnection("jdbc:arrow-flight-sql://localhost:50051", props);
```
Custom Cache Keys
Set the `Spice-Cache-Key` header to share cache entries across semantically equivalent but syntactically different queries. Valid keys: up to 128 alphanumeric characters plus `-` and `_`. Custom keys take precedence over `cache_key_type`.
```bash
# First query: cache MISS
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where org_id = 1;"

# Different query, same cache key: cache HIT
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where split_part(email, '@', 2) = 'spice.ai';"
```
> **Warning**: Ensure queries sharing a cache key are truly semantically equivalent. The runtime will return the cached result regardless of the actual query.
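A client can check a key against the stated rule before sending it. The sketch below assumes the rule means 1 to 128 ASCII letters, digits, `-`, or `_`; the runtime's exact validation is not shown in this document, and `is_valid_cache_key` is a hypothetical helper.

```python
import re

# Assumed reading of the rule: 1 to 128 ASCII letters, digits, '-' or '_'.
CACHE_KEY_RE = re.compile(r"[A-Za-z0-9_-]{1,128}")


def is_valid_cache_key(key: str) -> bool:
    """Return True if the key fits the documented character set and length limit."""
    return CACHE_KEY_RE.fullmatch(key) is not None


for candidate in ("users_spiceai", "bad key!", "a" * 129):
    print(len(candidate), is_valid_cache_key(candidate))
```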
Response Headers
Responses include a header indicating cache status:
| Cache Type | Response Header |
|---|---|
| |
| |

| Status | Meaning |
|---|---|
| `HIT` | Served from cache |
| `MISS` | Cache checked, result not found |
| `BYPASS` | Cache bypassed (e.g., `cache-control: no-cache`) |
| `STALE` | Stale entry served while revalidating |
| (absent) | Cache did not apply (disabled or system table query) |
Monitoring / Metrics
Cache metrics are available at the Prometheus-compatible metrics endpoint. Metric names are prefixed by cache type: `results_*`, `search_results_*`, `embeddings_*`.
| Metric | Type | Description |
|---|---|---|
| | Gauge | Configured max cache size |
| | Counter | Total cache lookups |
| | Counter | Total cache hits |
| | Gauge | Current items in cache |
| | Gauge | Current cache size |
| | Counter | Total evictions |
| | Gauge | Hit ratio (hits / total) |
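Because the lookup and hit metrics are cumulative counters, a hit ratio for a recent window can be derived from two scrapes. The sketch below assumes the counter values have already been read from the metrics endpoint; `CacheCounters` and `window_hit_ratio` are hypothetical names, and the sample numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheCounters:
    lookups: int  # total cache lookups (Counter)
    hits: int     # total cache hits (Counter)


def window_hit_ratio(before: CacheCounters, after: CacheCounters) -> float:
    """Hit ratio (hits / total lookups) between two scrapes of the counters."""
    delta_lookups = after.lookups - before.lookups
    delta_hits = after.hits - before.hits
    # No lookups in the window: report 0.0 rather than dividing by zero.
    return delta_hits / delta_lookups if delta_lookups > 0 else 0.0


earlier = CacheCounters(lookups=100, hits=40)
later = CacheCounters(lookups=300, hits=190)
print(window_hit_ratio(earlier, later))  # 150 hits over 200 lookups = 0.75
```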
Common Recipes
High-throughput Dashboard (Maximize Hit Rate)
```yaml
runtime:
  caching:
    sql_results:
      item_ttl: 30s
      max_size: 2GiB
      eviction_policy: tiny_lfu
      encoding: zstd
      stale_while_revalidate_ttl: 30s
```
Low-Latency API (Exact Queries, Fast Lookups)
```yaml
runtime:
  caching:
    sql_results:
      item_ttl: 5s
      cache_key_type: sql
      hashing_algorithm: xxh3
```
Disable Caching Entirely
```yaml
runtime:
  caching:
    sql_results:
      enabled: false
    search_results:
      enabled: false
    embeddings:
      enabled: false
```
Troubleshooting
| Issue | Solution |
|---|---|
| Always getting | Check |
| Cache filling up quickly | Increase `max_size` |
| Stale data being served | Reduce `item_ttl` |
| Dynamic functions (e.g., `NOW()`) | Switch to `cache_key_type: plan` |
| SWR conflict error | Don't set both `runtime.caching.sql_results.stale_while_revalidate_ttl` and `acceleration.params.caching_stale_while_revalidate_ttl` for the same dataset |