spice-caching

Spice Caching

Configure in-memory caching for SQL query results, search results, and embeddings in the Spice runtime.

Overview

Spice caches results from SQL queries (`/v1/sql`), search (`/v1/search`), and embeddings requests. All three caches are enabled by default with a 1-second TTL and a 128 MiB maximum size. Caching applies to both the HTTP and Arrow Flight APIs.

Configuration

Caching is configured under `runtime.caching` in `spicepod.yaml`:

```yaml
version: v1
kind: Spicepod
name: app

runtime:
  caching:
    sql_results:
      enabled: true
      max_size: 1GiB # Default 128MiB
      item_ttl: 1m # Default 1s
      eviction_policy: lru # lru | tiny_lfu
      hashing_algorithm: xxh3
      cache_key_type: plan # plan | sql
      encoding: none # none | zstd
      stale_while_revalidate_ttl: 30s # Default 0s (disabled)
    search_results:
      enabled: true
      max_size: 1GiB
      item_ttl: 1m
      eviction_policy: lru
    embeddings:
      enabled: true
      max_size: 128MiB
      item_ttl: 1m
```

Common Parameters (All Cache Types)

| Parameter | Default | Description |
| --- | --- | --- |
| `enabled` | `true` | Enable or disable the cache |
| `max_size` | `128MiB` | Maximum cache size |
| `eviction_policy` | `lru` | `lru` (Least Recently Used) or `tiny_lfu` (higher hit rate for skewed access) |
| `item_ttl` | `1s` | Cache entry TTL (Time to Live) |
| `hashing_algorithm` | `xxh3` | Hash for cache keys: `xxh3`, `ahash`, `siphash`, `blake3`, `xxh32`, `xxh64`, `xxh128` |

SQL Results Extra Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `cache_key_type` | `plan` | `plan` = logical plan (matches semantically equivalent queries); `sql` = raw SQL string (faster, exact match only) |
| `encoding` | `none` | `none` or `zstd` (compresses cached results, 50-90% reduction) |
| `stale_while_revalidate_ttl` | `0s` | Serve stale entries while refreshing in the background. `0s` = disabled |

Choosing Parameters

cache_key_type

- `plan` (default): Matches semantically equivalent queries even when the SQL syntax differs. Adds query parsing overhead.
- `sql`: Faster lookups, exact string match only. Avoid with dynamic functions like `NOW()`.
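
The difference between the two key types can be sketched in a few lines. This is an illustration only, not how the runtime derives its keys: `sql_key` and `normalized_key` are hypothetical helpers, `blake2b` stands in for the configured hashing algorithm, and whitespace and case normalization is a crude stand-in for building a logical plan.

```python
import hashlib
import re


def sql_key(query: str) -> str:
    """Key on the raw SQL text: only byte-identical strings share an entry."""
    return hashlib.blake2b(query.encode("utf-8"), digest_size=8).hexdigest()


def normalized_key(query: str) -> str:
    """Crude stand-in for a plan-based key (an assumption for illustration).

    Collapses whitespace, drops a trailing semicolon, and ignores case.
    A real logical plan captures far more, and lowercasing would be wrong
    for queries that contain case-sensitive string literals.
    """
    canonical = re.sub(r"\s+", " ", query.strip().rstrip(";")).lower()
    return hashlib.blake2b(canonical.encode("utf-8"), digest_size=8).hexdigest()


a = "SELECT id FROM users WHERE org_id = 1"
b = "select id\nfrom users\nwhere org_id = 1;"

print(sql_key(a) == sql_key(b))                # False: the text differs
print(normalized_key(a) == normalized_key(b))  # True: same meaning, same key
```

With a text-based key, every formatting variant of a query occupies its own cache entry, which is why `sql` suits clients that always send the same string.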

eviction_policy

- `lru` (default): Good general-purpose policy.
- `tiny_lfu`: Better hit rate when some queries are accessed much more frequently than others.
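
For readers unfamiliar with LRU, the policy can be sketched as follows. `LruCache` is a hypothetical class written for this illustration; it bounds the cache by item count, whereas the runtime's `max_size` is a byte limit, and it says nothing about how Spice implements eviction internally.

```python
from collections import OrderedDict


class LruCache:
    """Minimal least-recently-used cache, bounded by item count."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read marks the entry as recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_items:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LruCache(max_items=2)
cache.put("q1", b"rows-1")
cache.put("q2", b"rows-2")
cache.get("q1")             # q1 is now the most recently used entry
cache.put("q3", b"rows-3")  # over capacity, so q2 (least recent) is evicted

print(cache.get("q2"))  # None
print(cache.get("q1"))  # b'rows-1'
```

LRU tracks only recency, so a burst of one-off queries can push out a frequently used entry; that is the case where a frequency-aware policy such as `tiny_lfu` tends to do better.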

encoding

- `none` (default): No compression overhead, but uses more memory.
- `zstd`: High compression (50-90% reduction) with fast decompression. Use for large result sets.

hashing_algorithm

- `xxh3` (default): Fastest general-purpose option.
- `ahash` / `xxh64` / `xxh128`: Lower collision probability when many queries are cached.
- `blake3`: Use when cryptographic security is required.
- `siphash`: Protects against hash-flooding DoS attacks.

Stale-While-Revalidate

When `stale_while_revalidate_ttl` is set to a non-zero value:

1. Cache entries are served normally until `item_ttl` expires.
2. After `item_ttl` expires but before `item_ttl + stale_while_revalidate_ttl`, the stale entry is served immediately with `STALE` status.
3. A background task refreshes the cache entry.
4. After `item_ttl + stale_while_revalidate_ttl`, the entry is evicted.

```yaml
runtime:
  caching:
    sql_results:
      enabled: true
      item_ttl: 10s
      stale_while_revalidate_ttl: 10s
      # Fresh for 10s → Stale (served while refreshing) for 10s → Evicted
```

> **Conflict warning**: When using `refresh_mode: caching` on a dataset, do not configure both `runtime.caching.sql_results.stale_while_revalidate_ttl` and `acceleration.params.caching_stale_while_revalidate_ttl` for the same dataset. Choose one approach.
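
The timeline described in this section can be expressed as a small function of an entry's age. This is a sketch of the documented behaviour, not runtime code: `entry_state` is a hypothetical helper, the `FRESH` and `EVICTED` labels are names chosen here (only `STALE` appears as a reported status), and treating the exact boundary instants as already expired is an assumption.

```python
def entry_state(age_s: float, item_ttl_s: float, swr_ttl_s: float = 0.0) -> str:
    """Classify a cache entry by its age in seconds.

    age_s       time since the entry was stored
    item_ttl_s  the configured item_ttl
    swr_ttl_s   the configured stale_while_revalidate_ttl (0 disables it)
    """
    if age_s < item_ttl_s:
        return "FRESH"    # served normally
    if age_s < item_ttl_s + swr_ttl_s:
        return "STALE"    # served immediately while a background refresh runs
    return "EVICTED"      # no longer served; the next request is a miss


# item_ttl: 10s, stale_while_revalidate_ttl: 10s
for age in (5, 15, 25):
    print(age, entry_state(age, item_ttl_s=10, swr_ttl_s=10))
```

With `stale_while_revalidate_ttl` left at its default of `0s`, the middle window is empty and an entry goes straight from fresh to evicted.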

Cache Control Headers

HTTP API

Use the standard `Cache-Control` header with `/v1/sql` and `/v1/search`:

| Directive | Description |
| --- | --- |
| `no-cache` | Skip the cache for this request; cache the result for future requests |
| `min-fresh=N` | Require the cached entry to remain fresh for at least N seconds |
| `max-stale=N` | Accept stale responses up to N seconds old |
| `only-if-cached` | Return only cached responses; error on cache miss |
| `stale-if-error=N` | Serve stale cache (up to N seconds) if fetching fresh data fails |

```bash
# Skip cache for this query
curl -H "cache-control: no-cache" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only accept fresh results (at least 30s remaining)
curl -H "cache-control: min-fresh=30" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Accept stale up to 60s
curl -H "cache-control: max-stale=60" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only return if cached
curl -H "cache-control: only-if-cached" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
```
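
The same requests can be built from application code. The sketch below uses only the Python standard library and mirrors the curl calls; `build_sql_request` is a hypothetical helper, and the base URL is the `localhost:8090` address used in the examples. The request is constructed but not sent, so the snippet runs without a Spice runtime.

```python
import urllib.request


def build_sql_request(sql, cache_control=None, base_url="http://localhost:8090"):
    """Build a POST to /v1/sql, optionally carrying a Cache-Control directive."""
    req = urllib.request.Request(
        f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),  # the query text is the request body, as with curl -d
        method="POST",
    )
    if cache_control is not None:
        req.add_header("Cache-Control", cache_control)
    return req


req = build_sql_request("SELECT 1", cache_control="max-stale=60")
print(req.get_method(), req.full_url)
print(req.header_items())

# To send it against a running runtime:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```

Keeping the directive as a plain string argument means any of the values from the table, such as `no-cache` or `only-if-cached`, can be passed through unchanged.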

Spice CLI

```bash
spice sql --cache-control no-cache
spice sql --cache-control min-fresh=30
spice sql --cache-control max-stale=60
spice sql --cache-control only-if-cached
spice search --cache-control no-cache
```

Arrow FlightSQL

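Set `cache-control` in the request metadata:

```rust
use arrow_flight::FlightDescriptor;
use tonic::IntoRequest;
let mut request = FlightDescriptor::new_cmd(sql_command_bytes).into_request();
request.metadata_mut().insert("cache-control", "no-cache".parse().unwrap()); // value must be a MetadataValue
```

JDBC:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

Properties props = new Properties();
props.setProperty("cache-control", "no-cache");
Connection conn = DriverManager.getConnection("jdbc:arrow-flight-sql://localhost:50051", props);
```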

Custom Cache Keys

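Set the `Spice-Cache-Key` header to share cache entries across semantically equivalent but syntactically different queries. Valid keys are up to 128 characters long and consist of alphanumeric characters plus a small set of separators, such as the `_` used in the example key `users_spiceai`. Custom keys take precedence over `cache_key_type`.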

```bash
# First query: cache MISS
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where org_id = 1;"

# Different query, same cache key: cache HIT
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where split_part(email, '@', 2) = 'spice.ai';"
```

> **Warning**: Ensure queries sharing a cache key are truly semantically equivalent. The runtime will return the cached result regardless of the actual query.

Response Headers

Responses include a header indicating cache status:

| Cache Type | Response Header |
| --- | --- |
| `sql_results` | `Results-Cache-Status` |
| `search_results` | `Search-Results-Cache-Status` |

| Status | Meaning |
| --- | --- |
| `HIT` | Served from cache |
| `MISS` | Cache checked, result not found |
| `BYPASS` | Cache bypassed (e.g., `cache-control: no-cache`) |
| `STALE` | Stale entry served while revalidating |
| (absent) | Cache did not apply (disabled or system table query) |
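
A client that logs or asserts on cache behaviour needs to handle the absent case as well as the four status values. The following sketch reads the status from a plain dictionary of response headers; `cache_status` and `STATUS_HEADERS` are hypothetical names, and the header names are taken from the table above.

```python
STATUS_HEADERS = {
    "sql_results": "Results-Cache-Status",
    "search_results": "Search-Results-Cache-Status",
}


def cache_status(headers, cache_type="sql_results"):
    """Return HIT, MISS, BYPASS, or STALE, or None when the header is absent.

    HTTP header names are case-insensitive, so the lookup ignores case.
    """
    wanted = STATUS_HEADERS[cache_type].lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value.strip().upper()
    return None  # cache did not apply (disabled, or a system table query)


print(cache_status({"results-cache-status": "HIT"}))
print(cache_status({"Content-Type": "application/json"}))
print(cache_status({"Search-Results-Cache-Status": "MISS"}, "search_results"))
```

Returning `None` rather than a fifth status string keeps "the cache did not apply" distinct from `BYPASS`, where the cache applied but was skipped on request.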

Monitoring / Metrics

Cache metrics are available at the Prometheus-compatible metrics endpoint. Metric names are prefixed by cache type: `results_*`, `search_results_*`, and `embeddings_*`.

| Metric | Type | Description |
| --- | --- | --- |
| `*_cache_max_size_bytes` | Gauge | Configured max cache size |
| `*_cache_requests` | Counter | Total cache lookups |
| `*_cache_hits` | Counter | Total cache hits |
| `*_cache_items_count` | Gauge | Current items in cache |
| `*_cache_size_bytes` | Gauge | Current cache size |
| `*_cache_evictions` | Counter | Total evictions |
| `*_cache_hit_ratio` | Gauge | Hit ratio (hits / total) |
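
When only the raw counters are scraped, the hit ratio can be recomputed from hits and requests. The sketch below parses Prometheus text exposition output; `read_metrics` and `hit_ratio` are hypothetical helpers, the sample text is made up for illustration, and the full metric names are assumed from the prefix and suffix pattern in the table, so the names a given runtime exports may differ. It also assumes sample lines carry no trailing timestamp.

```python
def read_metrics(metrics_text):
    """Sum sample values per metric name, ignoring comments and labels."""
    values = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, raw_value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]  # drop any {label="..."} part
        try:
            values[name] = values.get(name, 0.0) + float(raw_value)
        except ValueError:
            continue  # not a numeric sample line
    return values


def hit_ratio(values, prefix="results"):
    """hits / requests for one cache type; 0.0 when there were no lookups."""
    requests = values.get(f"{prefix}_cache_requests", 0.0)
    hits = values.get(f"{prefix}_cache_hits", 0.0)
    return hits / requests if requests else 0.0


# Illustrative sample only, not captured from a real runtime.
SAMPLE = """\
# TYPE results_cache_requests counter
results_cache_requests 200
results_cache_hits 150
search_results_cache_requests{instance="a"} 10
search_results_cache_hits{instance="a"} 4
"""

metrics = read_metrics(SAMPLE)
print(hit_ratio(metrics, "results"))         # 0.75
print(hit_ratio(metrics, "search_results"))  # 0.4
```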

Common Recipes

High-throughput Dashboard (Maximize Hit Rate)

```yaml
runtime:
  caching:
    sql_results:
      item_ttl: 30s
      max_size: 2GiB
      eviction_policy: tiny_lfu
      encoding: zstd
      stale_while_revalidate_ttl: 30s
```

Low-Latency API (Exact Queries, Fast Lookups)

```yaml
runtime:
  caching:
    sql_results:
      item_ttl: 5s
      cache_key_type: sql
      hashing_algorithm: xxh3
```

Disable Caching Entirely

```yaml
runtime:
  caching:
    sql_results:
      enabled: false
    search_results:
      enabled: false
    embeddings:
      enabled: false
```

Troubleshooting

| Issue | Solution |
| --- | --- |
| Always getting `MISS` | Check that `item_ttl` is long enough; verify `cache_key_type` (`plan` matches equivalent queries, `sql` requires exact strings) |
| Cache filling up quickly | Increase `max_size`, enable `zstd` encoding, or reduce `item_ttl` |
| Stale data being served | Reduce `item_ttl` or `stale_while_revalidate_ttl`; use `cache-control: no-cache` for specific queries |
| Dynamic functions (`NOW()`) returning cached results | Switch to `cache_key_type: plan` or use `cache-control: no-cache` |
| SWR conflict error | Don't set both `runtime.caching.sql_results.stale_while_revalidate_ttl` and `acceleration.params.caching_stale_while_revalidate_ttl` for the same dataset |
