spice-caching

Spice Caching

Configure in-memory caching for SQL query results, search results, and embeddings in the Spice runtime.

Overview

Spice caches results from SQL queries (`/v1/sql`), search (`/v1/search`), and embeddings requests. All three caches are enabled by default with a 1-second TTL and a 128 MiB maximum size. Caching applies to the HTTP and Arrow Flight APIs.

Configuration

Caching is configured under `runtime.caching` in `spicepod.yaml`:

```yaml
version: v1
kind: Spicepod
name: app

runtime:
  caching:
    sql_results:
      enabled: true
      max_size: 1GiB # Default 128MiB
      item_ttl: 1m # Default 1s
      eviction_policy: lru # lru | tiny_lfu
      hashing_algorithm: xxh3
      cache_key_type: plan # plan | sql
      encoding: none # none | zstd
      stale_while_revalidate_ttl: 30s # Default 0s (disabled)
    search_results:
      enabled: true
      max_size: 1GiB
      item_ttl: 1m
      eviction_policy: lru
    embeddings:
      enabled: true
      max_size: 128MiB
      item_ttl: 1m
```

Common Parameters (All Cache Types)

| Parameter | Default | Description |
| --- | --- | --- |
| `enabled` | `true` | Enable/disable the cache |
| `max_size` | `128MiB` | Maximum cache size |
| `eviction_policy` | `lru` | `lru` (Least Recently Used) or `tiny_lfu` (higher hit rate for skewed access) |
| `item_ttl` | `1s` | Cache entry TTL (Time to Live) |
| `hashing_algorithm` | `xxh3` | Hash for cache keys: `xxh3`, `ahash`, `siphash`, `blake3`, `xxh32`, `xxh64`, `xxh128` |

SQL Results Extra Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `cache_key_type` | `plan` | `plan` = logical plan (matches semantically equivalent queries); `sql` = raw SQL string (faster, exact match only) |
| `encoding` | `none` | `none` or `zstd` (compresses cached results, 50-90% reduction) |
| `stale_while_revalidate_ttl` | `0s` | Serve stale entries while refreshing in the background. `0s` = disabled |

Choosing Parameters

cache_key_type

  • `plan` (default): Matches semantically equivalent queries even with different SQL syntax. Adds query parsing overhead to each lookup.
  • `sql`: Faster lookups, exact string match. Avoid with dynamic functions like `NOW()`.
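
To make the trade-off concrete, the sketch below contrasts the two key styles. It is not how Spice derives keys: `sql_key` hashes the raw string, while `normalized_key` is a hypothetical stand-in for a plan-based key that only collapses whitespace and case (a real logical plan captures far more equivalences). SHA-256 from the standard library stands in for the runtime's hashing algorithms.

```python
import hashlib


def sql_key(sql: str) -> str:
    """Exact-match key: any difference in the query text yields a different key."""
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()


def normalized_key(sql: str) -> str:
    """Crude stand-in for a plan key (assumption, illustration only).

    Collapses whitespace, drops a trailing semicolon, and lowercases the text.
    Lowercasing would be wrong for queries with case-sensitive string literals.
    """
    canonical = " ".join(sql.strip().rstrip(";").split()).lower()
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = "SELECT id FROM users WHERE org_id = 1"
b = "select  id\nfrom users where org_id = 1;"

print(sql_key(a) == sql_key(b))                # exact-match keys differ
print(normalized_key(a) == normalized_key(b))  # normalized keys collide on purpose
```

With `cache_key_type: sql`, queries `a` and `b` would occupy two cache entries; a key that sees past formatting lets them share one.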

eviction_policy

  • `lru` (default): Good general-purpose policy.
  • `tiny_lfu`: Better hit rate when some queries are accessed much more frequently than others.
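
For readers unfamiliar with LRU, here is a minimal sketch of the policy built on `collections.OrderedDict`. The `LruCache` class is hypothetical and bounds the cache by item count for simplicity, whereas the runtime's `max_size` is expressed in bytes.

```python
from collections import OrderedDict


class LruCache:
    """Toy LRU cache bounded by item count (the runtime bounds by bytes)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        # A hit marks the entry as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            # Evict the least recently used entry (front of the ordering).
            self._items.popitem(last=False)


cache = LruCache(capacity=2)
cache.put("q1", "rows-1")
cache.put("q2", "rows-2")
cache.get("q1")            # q1 is now the most recently used entry
cache.put("q3", "rows-3")  # evicts q2, the least recently used entry
```

A frequency-aware policy such as `tiny_lfu` additionally tracks how often keys are requested, so a burst of one-off queries is less likely to push out entries that are hit constantly.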

encoding

  • `none` (default): Zero compression overhead, uses more memory.
  • `zstd`: High compression (50-90% reduction) with fast decompression. Use for large result sets.

hashing_algorithm

  • `xxh3` (default): Fastest general-purpose option.
  • `ahash` / `xxh64` / `xxh128`: Lower collision probability when many queries are cached.
  • `blake3`: Use when cryptographic security is required.
  • `siphash`: Use for protection against hash-flooding DoS attacks.

Stale-While-Revalidate

When `stale_while_revalidate_ttl` is set to a non-zero value:
  1. Cache entries are served normally until `item_ttl` expires.
  2. After `item_ttl` expires but before `item_ttl + stale_while_revalidate_ttl`, the stale entry is served immediately with `STALE` status.
  3. A background task refreshes the cache entry.
  4. After `item_ttl + stale_while_revalidate_ttl`, the entry is evicted.

```yaml
runtime:
  caching:
    sql_results:
      enabled: true
      item_ttl: 10s
      stale_while_revalidate_ttl: 10s
      # Fresh for 10s → Stale (served while refreshing) for 10s → Evicted
```

Conflict warning: When using `refresh_mode: caching` on a dataset, do not configure both `runtime.caching.sql_results.stale_while_revalidate_ttl` and `acceleration.params.caching_stale_while_revalidate_ttl` for the same dataset. Choose one approach.
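
The lifecycle above can be sketched as a small function of entry age. This is an illustration under stated assumptions rather than the runtime's logic: the `entry_state` helper and its `FRESH`/`EVICTED` labels are hypothetical (only `STALE` appears as a documented status), and the handling of the exact boundary instants is a guess.

```python
def entry_state(age_s: float, item_ttl_s: float, swr_ttl_s: float) -> str:
    """Classify a cache entry by its age, in seconds.

    FRESH   -> served normally
    STALE   -> served immediately while a background refresh runs
    EVICTED -> no longer served from cache
    """
    if age_s < item_ttl_s:
        return "FRESH"
    # A zero stale_while_revalidate_ttl disables the stale window entirely.
    if swr_ttl_s > 0 and age_s < item_ttl_s + swr_ttl_s:
        return "STALE"
    return "EVICTED"


# Matches the example config: item_ttl 10s, stale_while_revalidate_ttl 10s.
for age in (5, 15, 25):
    print(age, entry_state(age, item_ttl_s=10, swr_ttl_s=10))
```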

Cache Control Headers

HTTP API

Use the standard `Cache-Control` header with `/v1/sql` and `/v1/search`:

| Directive | Description |
| --- | --- |
| `no-cache` | Skip cache for this request; cache the result for future requests |
| `min-fresh=N` | Require cached entry to remain fresh for at least N seconds |
| `max-stale=N` | Accept stale responses up to N seconds old |
| `only-if-cached` | Return only cached responses; error on cache miss |
| `stale-if-error=N` | Serve stale cache (up to N seconds) if fetching fresh data fails |
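
As a rough guide to how `no-cache`, `min-fresh`, and `max-stale` interact with an entry's age, the sketch below follows the general HTTP caching semantics of these directives. It is not taken from the Spice runtime: both helper functions are hypothetical, and stale-while-revalidate and `stale-if-error` are ignored.

```python
def parse_cache_control(header: str) -> dict:
    """Parse 'max-stale=60, only-if-cached' into {'max-stale': 60, 'only-if-cached': None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name] = int(value) if value else None
    return directives


def cached_entry_acceptable(age_s: float, item_ttl_s: float, directives: dict) -> bool:
    """Decide whether a cached entry may satisfy the request (simplified)."""
    if "no-cache" in directives:
        return False  # the cache is skipped for this request
    remaining = item_ttl_s - age_s
    if remaining >= 0:
        # Entry is fresh: min-fresh demands at least N more seconds of freshness.
        return remaining >= (directives.get("min-fresh") or 0)
    # Entry is stale: only acceptable if max-stale covers the staleness.
    max_stale = directives.get("max-stale")
    return max_stale is not None and -remaining <= max_stale


directives = parse_cache_control("max-stale=60")
print(cached_entry_acceptable(age_s=90, item_ttl_s=60, directives=directives))
```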
```bash
# Skip cache for this query
curl -H "cache-control: no-cache" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only accept fresh results (at least 30s remaining)
curl -H "cache-control: min-fresh=30" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Accept stale up to 60s
curl -H "cache-control: max-stale=60" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only return if cached
curl -H "cache-control: only-if-cached" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
```
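
The same requests can be issued from code. The sketch below builds the request with Python's standard library; the `build_sql_request` helper is hypothetical, and the base URL simply mirrors the curl examples. Nothing is sent until `urlopen` is called against a running runtime.

```python
import urllib.request


def build_sql_request(sql, cache_control=None, base_url="http://localhost:8090"):
    """Build a POST to /v1/sql, optionally carrying a Cache-Control directive."""
    headers = {}
    if cache_control:
        headers["Cache-Control"] = cache_control
    return urllib.request.Request(
        f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_sql_request("SELECT 1", cache_control="no-cache")

# To execute against a running runtime:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```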

Spice CLI

```bash
spice sql --cache-control no-cache
spice sql --cache-control min-fresh=30
spice sql --cache-control max-stale=60
spice sql --cache-control only-if-cached
spice search --cache-control no-cache
```

Arrow FlightSQL

Set `cache-control` in request metadata:

```rust
let mut request = FlightDescriptor::new_cmd(sql_command_bytes).into_request();
// tonic metadata values are typed, so parse the string into a MetadataValue.
request.metadata_mut().insert("cache-control", "no-cache".parse().unwrap());
```

JDBC:

```java
import java.sql.*;
import java.util.Properties;

Properties props = new Properties();
props.setProperty("cache-control", "no-cache");
Connection conn = DriverManager.getConnection("jdbc:arrow-flight-sql://localhost:50051", props);
```

Custom Cache Keys

Set the `Spice-Cache-Key` header to share cache entries across semantically equivalent but syntactically different queries. Valid keys: up to 128 alphanumeric characters plus `-` and `_`. Custom keys take precedence over `cache_key_type`.
```bash
# First query: cache MISS
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where org_id = 1;"

# Different query, same cache key: cache HIT
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where split_part(email, '@', 2) = 'spice.ai';"
```

> **Warning**: Ensure queries sharing a cache key are truly semantically equivalent. The runtime will return the cached result regardless of the actual query.
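
Because an invalid key is easy to produce when keys are derived from user input, a client-side check can help. The sketch below encodes the stated rule (up to 128 alphanumeric characters plus `-` and `_`); the `is_valid_cache_key` helper is hypothetical, and it assumes ASCII characters and a minimum length of one, neither of which the rule spells out.

```python
import re

# Assumption: ASCII letters and digits only, at least one character.
_CACHE_KEY = re.compile(r"[A-Za-z0-9_-]{1,128}")


def is_valid_cache_key(key: str) -> bool:
    """Return True if the key fits the documented Spice-Cache-Key format."""
    return _CACHE_KEY.fullmatch(key) is not None


print(is_valid_cache_key("users_spiceai"))
print(is_valid_cache_key("users spiceai"))
```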

Response Headers

Responses include a header indicating cache status:

| Cache Type | Response Header |
| --- | --- |
| `sql_results` | `Results-Cache-Status` |
| `search_results` | `Search-Results-Cache-Status` |

| Status | Meaning |
| --- | --- |
| `HIT` | Served from cache |
| `MISS` | Cache checked, result not found |
| `BYPASS` | Cache bypassed (e.g., `cache-control: no-cache`) |
| `STALE` | Stale entry served while revalidating |
| (absent) | Cache did not apply (disabled or system table query) |
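
A client can read the status header to confirm whether a request was served from cache. The sketch below is a minimal lookup over a plain dictionary of response headers; the `cache_status` helper is hypothetical, and it matches header names case-insensitively because HTTP header names are not case-sensitive.

```python
STATUS_HEADERS = {
    "sql_results": "Results-Cache-Status",
    "search_results": "Search-Results-Cache-Status",
}


def cache_status(headers: dict, cache_type: str):
    """Return HIT, MISS, BYPASS, or STALE, or None when the header is absent."""
    wanted = STATUS_HEADERS[cache_type].lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value.strip()
    return None  # cache did not apply (disabled or system table query)


print(cache_status({"results-cache-status": "HIT"}, "sql_results"))
```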

Monitoring / Metrics

Cache metrics are available at the Prometheus-compatible metrics endpoint. Prefix by cache type: `results_*`, `search_results_*`, `embeddings_*`.

| Metric | Type | Description |
| --- | --- | --- |
| `*_cache_max_size_bytes` | Gauge | Configured max cache size |
| `*_cache_requests` | Counter | Total cache lookups |
| `*_cache_hits` | Counter | Total cache hits |
| `*_cache_items_count` | Gauge | Current items in cache |
| `*_cache_size_bytes` | Gauge | Current cache size |
| `*_cache_evictions` | Counter | Total evictions |
| `*_cache_hit_ratio` | Gauge | Hit ratio (hits / total) |
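
The hit ratio can also be recomputed from the two counters, for example to compare intervals. The sketch below parses a few lines of Prometheus text format; the sample values are made up, and the full metric names (such as `results_cache_hits`) are an assumption formed by joining the prefix and suffix patterns above, so check the names your endpoint actually exports.

```python
def parse_metrics(text: str) -> dict:
    """Parse 'name value' lines from Prometheus text format (no timestamps)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        values[name] = float(value)
    return values


def hit_ratio(values: dict, prefix: str) -> float:
    """hits / total lookups for one cache type; 0.0 when there were no lookups."""
    requests = values.get(f"{prefix}_cache_requests", 0.0)
    hits = values.get(f"{prefix}_cache_hits", 0.0)
    return hits / requests if requests else 0.0


# Made-up sample; metric names are assumed, not copied from a real endpoint.
sample = """\
# TYPE results_cache_requests counter
results_cache_requests 200
results_cache_hits 150
"""

print(hit_ratio(parse_metrics(sample), "results"))
```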

Common Recipes

High-throughput Dashboard (Maximize Hit Rate)

```yaml
runtime:
  caching:
    sql_results:
      item_ttl: 30s
      max_size: 2GiB
      eviction_policy: tiny_lfu
      encoding: zstd
      stale_while_revalidate_ttl: 30s
```

Low-Latency API (Exact Queries, Fast Lookups)

```yaml
runtime:
  caching:
    sql_results:
      item_ttl: 5s
      cache_key_type: sql
      hashing_algorithm: xxh3
```

Disable Caching Entirely

```yaml
runtime:
  caching:
    sql_results:
      enabled: false
    search_results:
      enabled: false
    embeddings:
      enabled: false
```

Troubleshooting

| Issue | Solution |
| --- | --- |
| Always getting `MISS` | Check `item_ttl` is long enough; verify `cache_key_type` (`plan` matches equivalent queries, `sql` requires exact strings) |
| Cache filling up quickly | Increase `max_size`, enable `zstd` encoding, or reduce `item_ttl` |
| Stale data being served | Reduce `item_ttl` or `stale_while_revalidate_ttl`; use `cache-control: no-cache` for specific queries |
| Dynamic functions (`NOW()`) returning cached results | Switch to `cache_key_type: plan` or use `cache-control: no-cache` |
| SWR conflict error | Don't set both `runtime.caching.sql_results.stale_while_revalidate_ttl` and `acceleration.params.caching_stale_while_revalidate_ttl` for the same dataset |