stash-ai-memory


Stash AI Memory


Skill by ara.so — Daily 2026 Skills collection.
Stash is a self-hosted persistent memory layer for AI agents. It stores episodes, facts, and working context in Postgres with pgvector, runs an 8-stage consolidation pipeline to turn raw observations into structured knowledge, and exposes everything via an MCP server that works with any MCP-compatible agent (Claude Desktop, Cursor, Windsurf, Cline, Continue, OpenAI Agents, Ollama, OpenRouter).

Architecture


```
Agent ──► MCP Server ──► Postgres + pgvector
                └──► Background Consolidation Pipeline
                     (Episodes → Facts → Relationships →
                      Causal Links → Goals → Failures →
                      Hypotheses → Confidence Decay)
```

Quick Start (Docker — Recommended)


```bash
git clone https://github.com/alash3al/stash.git
cd stash
cp .env.example .env

# Edit .env with your LLM API key and model

docker compose up
```

This starts Postgres with pgvector, runs migrations, and launches the MCP server with background consolidation.

Environment Configuration


```bash
# .env

# LLM provider (OpenAI-compatible endpoint)
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=$OPENAI_API_KEY
LLM_MODEL=gpt-4o-mini

# Or use Ollama (local)
# LLM_API_KEY=ollama
# LLM_MODEL=llama3.2

# Or OpenRouter
# LLM_API_KEY=$OPENROUTER_API_KEY
# LLM_MODEL=anthropic/claude-3-haiku

# Postgres connection
DATABASE_URL=postgres://stash:stash@localhost:5432/stash?sslmode=disable

# MCP server
MCP_SERVER_ADDR=:8080

# Consolidation pipeline interval
CONSOLIDATION_INTERVAL=5m
```

Binary / Manual Install


```bash
git clone https://github.com/alash3al/stash.git
cd stash

# Build the binary
go build -o stash ./cmd/stash

# Run migrations and start server
./stash serve
```

Connecting MCP Clients


Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json)

```json
{
  "mcpServers": {
    "stash": {
      "url": "http://localhost:8080/mcp",
      "transport": "http"
    }
  }
}
```

Cursor / Windsurf / Cline (.cursor/mcp.json or equivalent)

```json
{
  "mcpServers": {
    "stash": {
      "url": "http://localhost:8080/mcp",
      "transport": "http"
    }
  }
}
```

Continue (~/.continue/config.json)

```json
{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "http",
          "url": "http://localhost:8080/mcp"
        }
      }
    ]
  }
}
```

MCP Tools Exposed to Agents

Stash exposes these tools via MCP that agents call automatically:

| Tool | Purpose |
|------|---------|
| stash_remember | Store an episode or observation |
| stash_recall | Semantic search across memory |
| stash_facts | Query consolidated facts |
| stash_context | Get/set working context |
| stash_forget | Remove specific memories |

Using Stash Programmatically (Go)


```go
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/alash3al/stash/pkg/client"
)

func main() {
    c, err := client.New(client.Config{
        BaseURL: "http://localhost:8080",
    })
    if err != nil {
        log.Fatal(err)
    }

    ctx := context.Background()

    // Store an episode
    err = c.Remember(ctx, client.Episode{
        AgentID: "my-agent",
        Content: "User prefers dark mode and uses vim keybindings",
        Tags:    []string{"preferences", "ui"},
    })
    if err != nil {
        log.Fatal(err)
    }

    // Recall relevant memories
    results, err := c.Recall(ctx, client.RecallQuery{
        AgentID: "my-agent",
        Query:   "what are the user's editor preferences?",
        Limit:   5,
    })
    if err != nil {
        log.Fatal(err)
    }

    for _, r := range results {
        fmt.Printf("[%.2f] %s\n", r.Score, r.Content)
    }
}
```

Docker Compose (Full Reference)


docker-compose.yml (from repo):

```yaml
services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: stash
      POSTGRES_PASSWORD: stash
      POSTGRES_DB: stash
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U stash"]
      interval: 5s
      timeout: 5s
      retries: 5

  stash:
    build: .
    env_file: .env
    environment:
      DATABASE_URL: postgres://stash:stash@postgres:5432/stash?sslmode=disable
    ports:
      - "8080:8080"
    depends_on:
      postgres:
        condition: service_healthy

volumes:
  pgdata:
```

Consolidation Pipeline


The 8-stage pipeline runs on a configurable interval (default 5 minutes) and processes only new data since the last run:
  1. Episodes — raw observations stored by agents
  2. Facts — discrete true/false statements extracted from episodes
  3. Relationships — links between facts and entities
  4. Causal Links — cause-and-effect patterns
  5. Goal Tracking — inferred agent/user goals
  6. Failure Patterns — what went wrong and why
  7. Hypothesis Verification — testing inferred beliefs against new data
  8. Confidence Decay — reducing confidence in stale/unconfirmed facts
To trigger consolidation manually (if supported):

```bash
curl -X POST http://localhost:8080/consolidate
```
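As an illustration of stage 8, confidence decay can be modeled as exponential down-weighting by time since a fact was last confirmed. This is a sketch of the general idea only, not Stash's actual formula:

```python
from datetime import datetime, timedelta

def decayed_confidence(confidence: float, last_confirmed: datetime,
                       now: datetime, half_life_days: float = 30.0) -> float:
    """Halve a fact's confidence every `half_life_days` since last confirmation."""
    age_days = (now - last_confirmed).total_seconds() / 86400
    return confidence * 0.5 ** (age_days / half_life_days)

now = datetime(2026, 1, 31)
print(round(decayed_confidence(0.9, now - timedelta(days=30), now), 3))  # 0.45
print(round(decayed_confidence(0.9, now - timedelta(days=60), now), 3))  # 0.225
```

A recently confirmed fact keeps nearly full confidence, while a fact unconfirmed for two half-lives drops to a quarter of its original score.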

Working Context API


Working context is a scratchpad for in-flight session state:

```bash
# Set context
curl -X PUT http://localhost:8080/api/context/my-agent \
  -H "Content-Type: application/json" \
  -d '{"key": "current_task", "value": "debugging auth middleware"}'
```

Get context

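The page is missing the matching retrieval command here; based on the GET/PUT /api/context/:agent_id entry in the endpoints reference, a retrieval call presumably looks like:

```shell
curl http://localhost:8080/api/context/my-agent
```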

Common Patterns


Pattern 1: Agent with Memory in Python (via MCP HTTP)


```python
import requests

STASH_URL = "http://localhost:8080"

def remember(agent_id: str, content: str, tags: list[str] = None):
    requests.post(f"{STASH_URL}/api/episodes", json={
        "agent_id": agent_id,
        "content": content,
        "tags": tags or [],
    })

def recall(agent_id: str, query: str, limit: int = 5) -> list[dict]:
    r = requests.post(f"{STASH_URL}/api/recall", json={
        "agent_id": agent_id,
        "query": query,
        "limit": limit,
    })
    return r.json().get("results", [])
```

Usage:

```python
remember("assistant-1", "User is building a Go microservice with gRPC")
memories = recall("assistant-1", "what is the user working on?")
for m in memories:
    print(f"[{m['score']:.2f}] {m['content']}")
```

Pattern 2: Injecting Memory into System Prompt


```python
def build_system_prompt(agent_id: str, base_prompt: str, user_message: str) -> str:
    memories = recall(agent_id, user_message, limit=10)
    if not memories:
        return base_prompt

    memory_block = "\n".join(f"- {m['content']}" for m in memories)
    return f"""{base_prompt}

Relevant Memory

{memory_block}
"""
```
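Since recall needs a running Stash server, the prompt-assembly step can be factored into a pure function for offline testing. This refactor is illustrative only, not part of the Stash API:

```python
def format_prompt(base_prompt: str, memories: list[dict]) -> str:
    """Append a Relevant Memory section to the base prompt, if any memories exist."""
    if not memories:
        return base_prompt
    memory_block = "\n".join(f"- {m['content']}" for m in memories)
    return f"{base_prompt}\n\nRelevant Memory\n\n{memory_block}\n"

print(format_prompt("You are a helpful assistant.",
                    [{"content": "User prefers dark mode"}]))
```

Keeping the formatting separate from the network call makes the memory-injection behavior easy to unit test.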

Pattern 3: OpenAI Agents SDK Integration


```python
from agents import Agent, Runner
from agents.mcp import MCPServerHTTP

stash_mcp = MCPServerHTTP(url="http://localhost:8080/mcp")

agent = Agent(
    name="my-agent",
    instructions="You have persistent memory. Use stash tools to remember and recall.",
    mcp_servers=[stash_mcp],
)

result = Runner.run_sync(agent, "What do you remember about my coding preferences?")
print(result.final_output)
```

Troubleshooting


Postgres connection refused


```bash
# Check pgvector extension is available
docker exec -it stash-postgres-1 psql -U stash -c "SELECT * FROM pg_extension WHERE extname='vector';"

# If missing, install it
docker exec -it stash-postgres-1 psql -U stash -c "CREATE EXTENSION vector;"
```

MCP server not reachable from Claude Desktop


  • Ensure http://localhost:8080/mcp is accessible (not https)
  • Check that Claude Desktop supports HTTP MCP transport (requires Claude Desktop ≥ 0.10)
  • Try curl http://localhost:8080/mcp to verify the server is up

Consolidation not running


```bash
# Check logs for consolidation pipeline errors
docker compose logs stash | grep -i consolidat

# Verify LLM credentials are correct (consolidation uses the LLM to extract facts)
curl $LLM_BASE_URL/models -H "Authorization: Bearer $LLM_API_KEY"
```

Embedding/recall returning no results

  • Consolidation may not have run yet (wait one interval or trigger manually)
  • Verify the LLM model supports embeddings or that a separate embedding model is configured
  • Check that episodes were actually stored: curl http://localhost:8080/api/episodes?agent_id=my-agent

Resetting all memory


```bash
# Nuclear option: wipe and restart
docker compose down -v
docker compose up
```

Key Endpoints Reference


| Method | Path | Description |
|--------|------|-------------|
| POST | /api/episodes | Store a new episode |
| POST | /api/recall | Semantic recall query |
| GET | /api/facts | List consolidated facts |
| GET/PUT | /api/context/:agent_id | Working context |
| DELETE | /api/episodes/:id | Forget an episode |
| POST | /consolidate | Trigger consolidation manually |
| GET | /health | Health check |
| * | /mcp | MCP protocol endpoint |

Self-Hosting Checklist


  • Postgres 16+ with the pgvector extension
  • LLM API key with access to a chat-completion model
  • Port 8080 accessible to your MCP clients
  • Volume mounted for Postgres data persistence
  • CONSOLIDATION_INTERVAL tuned to your usage (default 5m)
  • Agent IDs consistent across sessions for memory continuity