# Stash AI Memory
Skill by ara.so — Daily 2026 Skills collection.
Stash is a self-hosted persistent memory layer for AI agents. It stores episodes, facts, and working context in Postgres with pgvector, runs an 8-stage consolidation pipeline to turn raw observations into structured knowledge, and exposes everything via an MCP server that works with any MCP-compatible agent (Claude Desktop, Cursor, Windsurf, Cline, Continue, OpenAI Agents, Ollama, OpenRouter).
## Architecture
```
Agent ──► MCP Server ──► Postgres + pgvector
               │
               └──► Background Consolidation Pipeline
                    (Episodes → Facts → Relationships →
                     Causal Links → Goals → Failures →
                     Hypotheses → Confidence Decay)
```

## Quick Start (Docker — Recommended)
```bash
git clone https://github.com/alash3al/stash.git
cd stash
cp .env.example .env

# Edit .env with your LLM API key and model

docker compose up
```

This starts Postgres with pgvector, runs migrations, and launches the MCP server with background consolidation.
## Environment Configuration
```bash
# .env

# LLM provider (OpenAI-compatible endpoint)
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=$OPENAI_API_KEY
LLM_MODEL=gpt-4o-mini

# Or use Ollama (local)
# LLM_BASE_URL=http://localhost:11434/v1
# LLM_API_KEY=ollama
# LLM_MODEL=llama3.2

# Or OpenRouter
# LLM_BASE_URL=https://openrouter.ai/api/v1
# LLM_API_KEY=$OPENROUTER_API_KEY
# LLM_MODEL=anthropic/claude-3-haiku

# Postgres connection
DATABASE_URL=postgres://stash:stash@localhost:5432/stash?sslmode=disable

# MCP server
MCP_SERVER_ADDR=:8080

# Consolidation pipeline interval
CONSOLIDATION_INTERVAL=5m
```
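It can be useful to sanity-check these variables before starting the server. The sketch below is illustrative, not part of Stash: the required-key list and the `$VAR` expansion behavior are assumptions, so adjust them to your deployment.

```python
import os

# Keys assumed mandatory for this sketch; adjust to your deployment
REQUIRED = ["LLM_BASE_URL", "LLM_API_KEY", "LLM_MODEL", "DATABASE_URL"]

def load_env(path: str = ".env") -> dict[str, str]:
    """Parse a .env file into a dict, expanding $VAR references from the host env."""
    env = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            key, _, value = line.partition("=")
            env[key.strip()] = os.path.expandvars(value.strip())
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return env
```

Failing fast on a missing key here is cheaper than debugging a consolidation pipeline that silently cannot reach its LLM.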
## Binary / Manual Install
```bash
git clone https://github.com/alash3al/stash.git
cd stash

# Build the binary
go build -o stash ./cmd/stash

# Run migrations and start server
./stash serve
```

## Connecting MCP Clients
### Claude Desktop (`~/Library/Application Support/Claude/claude_desktop_config.json`)
```json
{
  "mcpServers": {
    "stash": {
      "url": "http://localhost:8080/mcp",
      "transport": "http"
    }
  }
}
```

### Cursor / Windsurf / Cline (`.cursor/mcp.json` or equivalent)
```json
{
  "mcpServers": {
    "stash": {
      "url": "http://localhost:8080/mcp",
      "transport": "http"
    }
  }
}
```

### Continue (`~/.continue/config.json`)
```json
{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "http",
          "url": "http://localhost:8080/mcp"
        }
      }
    ]
  }
}
```

## MCP Tools Exposed to Agents
Stash exposes tools via MCP that agents call automatically, covering:

- Storing an episode or observation
- Semantic search across memory
- Querying consolidated facts
- Getting/setting working context
- Removing specific memories
## Using Stash Programmatically (Go)
```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/alash3al/stash/pkg/client"
)

func main() {
	c, err := client.New(client.Config{
		BaseURL: "http://localhost:8080",
	})
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()

	// Store an episode
	err = c.Remember(ctx, client.Episode{
		AgentID: "my-agent",
		Content: "User prefers dark mode and uses vim keybindings",
		Tags:    []string{"preferences", "ui"},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Recall relevant memories
	results, err := c.Recall(ctx, client.RecallQuery{
		AgentID: "my-agent",
		Query:   "what are the user's editor preferences?",
		Limit:   5,
	})
	if err != nil {
		log.Fatal(err)
	}

	for _, r := range results {
		fmt.Printf("[%.2f] %s\n", r.Score, r.Content)
	}
}
```

## Docker Compose (Full Reference)
```yaml
# docker-compose.yml (from repo)
services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: stash
      POSTGRES_PASSWORD: stash
      POSTGRES_DB: stash
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U stash"]
      interval: 5s
      timeout: 5s
      retries: 5

  stash:
    build: .
    env_file: .env
    environment:
      DATABASE_URL: postgres://stash:stash@postgres:5432/stash?sslmode=disable
    ports:
      - "8080:8080"
    depends_on:
      postgres:
        condition: service_healthy

volumes:
  pgdata:
```

## Consolidation Pipeline
The 8-stage pipeline runs on a configurable interval (default 5 minutes) and processes only new data since the last run:
- Episodes — raw observations stored by agents
- Facts — discrete true/false statements extracted from episodes
- Relationships — links between facts and entities
- Causal Links — cause-and-effect patterns
- Goal Tracking — inferred agent/user goals
- Failure Patterns — what went wrong and why
- Hypothesis Verification — testing inferred beliefs against new data
- Confidence Decay — reducing confidence in stale/unconfirmed facts
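The last stage can be pictured as exponential decay. Below is a toy sketch; the half-life model, the parameter names, and the 30-day default are illustrative assumptions, and Stash's actual decay function may differ.

```python
from datetime import datetime, timedelta

def decayed_confidence(confidence: float, last_confirmed: datetime,
                       now: datetime, half_life_days: float = 30.0) -> float:
    """Halve a fact's confidence for every `half_life_days` it goes unconfirmed."""
    age_days = (now - last_confirmed).total_seconds() / 86400.0
    return confidence * 0.5 ** (age_days / half_life_days)

t0 = datetime(2026, 1, 1)
fresh = decayed_confidence(0.8, t0, t0)                       # just confirmed: unchanged
stale = decayed_confidence(0.8, t0, t0 + timedelta(days=30))  # one half-life: halved
```

A scheme like this lets facts that keep being re-confirmed by new episodes stay authoritative while unconfirmed ones fade instead of being deleted outright.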
To trigger consolidation manually (if supported):

```bash
curl -X POST http://localhost:8080/consolidate
```

## Working Context API
Working context is a scratchpad for in-flight session state:
```bash
# Set context
curl -X PUT http://localhost:8080/api/context/my-agent \
  -H "Content-Type: application/json" \
  -d '{"key": "current_task", "value": "debugging auth middleware"}'

# Get context
curl http://localhost:8080/api/context/my-agent
```

## Common Patterns
### Pattern 1: Agent with Memory in Python (via MCP HTTP)
```python
import requests

STASH_URL = "http://localhost:8080"

def remember(agent_id: str, content: str, tags: list[str] = None):
    requests.post(f"{STASH_URL}/api/episodes", json={
        "agent_id": agent_id,
        "content": content,
        "tags": tags or [],
    })

def recall(agent_id: str, query: str, limit: int = 5) -> list[dict]:
    r = requests.post(f"{STASH_URL}/api/recall", json={
        "agent_id": agent_id,
        "query": query,
        "limit": limit,
    })
    return r.json().get("results", [])

# Usage
remember("assistant-1", "User is building a Go microservice with gRPC")
memories = recall("assistant-1", "what is the user working on?")
for m in memories:
    print(f"[{m['score']:.2f}] {m['content']}")
```

### Pattern 2: Injecting Memory into System Prompt
```python
def build_system_prompt(agent_id: str, base_prompt: str, user_message: str) -> str:
    memories = recall(agent_id, user_message, limit=10)
    if not memories:
        return base_prompt
    memory_block = "\n".join(f"- {m['content']}" for m in memories)
    return f"""{base_prompt}

## Relevant Memory
{memory_block}
"""
```

### Pattern 3: OpenAI Agents SDK Integration
```python
from agents import Agent, Runner
from agents.mcp import MCPServerHTTP

stash_mcp = MCPServerHTTP(url="http://localhost:8080/mcp")

agent = Agent(
    name="my-agent",
    instructions="You have persistent memory. Use stash tools to remember and recall.",
    mcp_servers=[stash_mcp],
)

result = Runner.run_sync(agent, "What do you remember about my coding preferences?")
print(result.final_output)
```

## Troubleshooting
### Postgres connection refused
```bash
# Check pgvector extension is available
docker exec -it stash-postgres-1 psql -U stash -c "SELECT * FROM pg_extension WHERE extname='vector';"

# If missing, install it
docker exec -it stash-postgres-1 psql -U stash -c "CREATE EXTENSION vector;"
```

### MCP server not reachable from Claude Desktop
- Ensure `http://localhost:8080/mcp` is accessible (not `https`)
- Check that Claude Desktop supports HTTP MCP transport (requires Claude Desktop ≥ 0.10)
- Try `curl http://localhost:8080/mcp` to verify the server is up
### Consolidation not running
```bash
# Check logs for consolidation pipeline errors
docker compose logs stash | grep -i consolidat

# Verify LLM credentials are correct — consolidation uses the LLM to extract facts
curl $LLM_BASE_URL/models -H "Authorization: Bearer $LLM_API_KEY"
```

### Embedding/recall returning no results
- Consolidation may not have run yet (wait one interval or trigger manually)
- Verify the LLM model supports embeddings or that a separate embedding model is configured
- Check that episodes were actually stored: `curl http://localhost:8080/api/episodes?agent_id=my-agent`
### Resetting all memory
```bash
# Nuclear option: wipe and restart
docker compose down -v
docker compose up
```

## Key Endpoints Reference
| Method | Path | Description |
|---|---|---|
| POST | `/api/episodes` | Store a new episode |
| POST | `/api/recall` | Semantic recall query |
| | | List consolidated facts |
| PUT / GET | `/api/context/{agent_id}` | Working context |
| | | Forget an episode |
| POST | `/consolidate` | Trigger consolidation manually |
| | | Health check |
| | `/mcp` | MCP protocol endpoint |
## Self-Hosting Checklist
- Postgres 16+ with pgvector extension
- LLM API key with access to a chat-completion model
- Port 8080 accessible to your MCP clients
- Volume mounted for Postgres data persistence
- `CONSOLIDATION_INTERVAL` tuned to your usage (default `5m`)
- Agent IDs consistent across sessions for memory continuity