stash-ai-memory


Stash AI Memory


Skill by ara.so — Daily 2026 Skills collection.
Stash is a self-hosted persistent memory layer for AI agents. It stores episodes, facts, and working context in Postgres with pgvector, runs an 8-stage consolidation pipeline to turn raw observations into structured knowledge, and exposes everything via an MCP server that works with any MCP-compatible agent (Claude Desktop, Cursor, Windsurf, Cline, Continue, OpenAI Agents, Ollama, OpenRouter).

Architecture


```
Agent ──► MCP Server ──► Postgres + pgvector
                └──► Background Consolidation Pipeline
                     (Episodes → Facts → Relationships →
                      Causal Links → Goals → Failures →
                      Hypotheses → Confidence Decay)
```

Quick Start (Docker — Recommended)


```bash
git clone https://github.com/alash3al/stash.git
cd stash
cp .env.example .env

# Edit .env with your LLM API key and model

docker compose up
```

This starts Postgres with pgvector, runs migrations, and launches the MCP server with background consolidation.

Environment Configuration


```bash
# .env

# LLM provider (OpenAI-compatible endpoint)
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=$OPENAI_API_KEY
LLM_MODEL=gpt-4o-mini

# Or use Ollama (local)
# LLM_API_KEY=ollama
# LLM_MODEL=llama3.2

# Or OpenRouter
# LLM_API_KEY=$OPENROUTER_API_KEY
# LLM_MODEL=anthropic/claude-3-haiku

# Postgres connection
DATABASE_URL=postgres://stash:stash@localhost:5432/stash?sslmode=disable

# MCP server
MCP_SERVER_ADDR=:8080

# Consolidation pipeline interval
CONSOLIDATION_INTERVAL=5m
```

Binary / Manual Install


```bash
git clone https://github.com/alash3al/stash.git
cd stash

# Build the binary
go build -o stash ./cmd/stash

# Run migrations and start server
./stash serve
```

Connecting MCP Clients


Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json)

```json
{
  "mcpServers": {
    "stash": {
      "url": "http://localhost:8080/mcp",
      "transport": "http"
    }
  }
}
```

Cursor / Windsurf / Cline (.cursor/mcp.json or equivalent)

```json
{
  "mcpServers": {
    "stash": {
      "url": "http://localhost:8080/mcp",
      "transport": "http"
    }
  }
}
```

Continue (~/.continue/config.json)

```json
{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "http",
          "url": "http://localhost:8080/mcp"
        }
      }
    ]
  }
}
```

MCP Tools Exposed to Agents

Stash exposes these tools via MCP that agents call automatically:

| Tool | Purpose |
|------|---------|
| stash_remember | Store an episode or observation |
| stash_recall | Semantic search across memory |
| stash_facts | Query consolidated facts |
| stash_context | Get/set working context |
| stash_forget | Remove specific memories |

Using Stash Programmatically (Go)


```go
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/alash3al/stash/pkg/client"
)

func main() {
    c, err := client.New(client.Config{
        BaseURL: "http://localhost:8080",
    })
    if err != nil {
        log.Fatal(err)
    }

    ctx := context.Background()

    // Store an episode
    err = c.Remember(ctx, client.Episode{
        AgentID: "my-agent",
        Content: "User prefers dark mode and uses vim keybindings",
        Tags:    []string{"preferences", "ui"},
    })
    if err != nil {
        log.Fatal(err)
    }

    // Recall relevant memories
    results, err := c.Recall(ctx, client.RecallQuery{
        AgentID: "my-agent",
        Query:   "what are the user's editor preferences?",
        Limit:   5,
    })
    if err != nil {
        log.Fatal(err)
    }

    for _, r := range results {
        fmt.Printf("[%.2f] %s\n", r.Score, r.Content)
    }
}
```

Docker Compose (Full Reference)


docker-compose.yml (from repo):

```yaml
services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: stash
      POSTGRES_PASSWORD: stash
      POSTGRES_DB: stash
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U stash"]
      interval: 5s
      timeout: 5s
      retries: 5

  stash:
    build: .
    env_file: .env
    environment:
      DATABASE_URL: postgres://stash:stash@postgres:5432/stash?sslmode=disable
    ports:
      - "8080:8080"
    depends_on:
      postgres:
        condition: service_healthy

volumes:
  pgdata:
```

Consolidation Pipeline


The 8-stage pipeline runs on a configurable interval (default 5 minutes) and processes only new data since the last run:
  1. Episodes — raw observations stored by agents
  2. Facts — discrete true/false statements extracted from episodes
  3. Relationships — links between facts and entities
  4. Causal Links — cause-and-effect patterns
  5. Goal Tracking — inferred agent/user goals
  6. Failure Patterns — what went wrong and why
  7. Hypothesis Verification — testing inferred beliefs against new data
  8. Confidence Decay — reducing confidence in stale/unconfirmed facts
To trigger consolidation manually (if supported):

```bash
curl -X POST http://localhost:8080/consolidate
```
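As an illustration of stage 8, confidence decay can be modeled as exponential down-weighting by time since a fact was last confirmed. This is a sketch of the general idea only, not Stash's actual formula:

```python
from datetime import datetime, timedelta

def decayed_confidence(confidence: float, last_confirmed: datetime,
                       now: datetime, half_life_days: float = 30.0) -> float:
    """Halve a fact's confidence every `half_life_days` since last confirmation."""
    age_days = (now - last_confirmed).total_seconds() / 86400
    return confidence * 0.5 ** (age_days / half_life_days)

now = datetime(2026, 1, 31)
print(round(decayed_confidence(0.9, now - timedelta(days=30), now), 3))  # 0.45
print(round(decayed_confidence(0.9, now - timedelta(days=60), now), 3))  # 0.225
```

A recently confirmed fact keeps nearly full confidence, while a fact unconfirmed for two half-lives drops to a quarter of its original score.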

Working Context API


Working context is a scratchpad for in-flight session state:

```bash
# Set context
curl -X PUT http://localhost:8080/api/context/my-agent \
  -H "Content-Type: application/json" \
  -d '{"key": "current_task", "value": "debugging auth middleware"}'
```

Get context

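The page is missing the matching retrieval command here; based on the GET/PUT /api/context/:agent_id entry in the endpoints reference, a retrieval call presumably looks like:

```shell
curl http://localhost:8080/api/context/my-agent
```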

Common Patterns


Pattern 1: Agent with Memory in Python (via MCP HTTP)


```python
import requests

STASH_URL = "http://localhost:8080"

def remember(agent_id: str, content: str, tags: list[str] = None):
    requests.post(f"{STASH_URL}/api/episodes", json={
        "agent_id": agent_id,
        "content": content,
        "tags": tags or [],
    })

def recall(agent_id: str, query: str, limit: int = 5) -> list[dict]:
    r = requests.post(f"{STASH_URL}/api/recall", json={
        "agent_id": agent_id,
        "query": query,
        "limit": limit,
    })
    return r.json().get("results", [])
```

Usage:

```python
remember("assistant-1", "User is building a Go microservice with gRPC")
memories = recall("assistant-1", "what is the user working on?")
for m in memories:
    print(f"[{m['score']:.2f}] {m['content']}")
```

Pattern 2: Injecting Memory into System Prompt


```python
def build_system_prompt(agent_id: str, base_prompt: str, user_message: str) -> str:
    memories = recall(agent_id, user_message, limit=10)
    if not memories:
        return base_prompt

    memory_block = "\n".join(f"- {m['content']}" for m in memories)
    return f"""{base_prompt}

Relevant Memory

{memory_block}
"""
```
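Since recall needs a running Stash server, the prompt-assembly step can be factored into a pure function for offline testing. This refactor is illustrative only, not part of the Stash API:

```python
def format_prompt(base_prompt: str, memories: list[dict]) -> str:
    """Append a Relevant Memory section to the base prompt, if any memories exist."""
    if not memories:
        return base_prompt
    memory_block = "\n".join(f"- {m['content']}" for m in memories)
    return f"{base_prompt}\n\nRelevant Memory\n\n{memory_block}\n"

print(format_prompt("You are a helpful assistant.",
                    [{"content": "User prefers dark mode"}]))
```

Keeping the formatting separate from the network call makes the memory-injection behavior easy to unit test.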

Pattern 3: OpenAI Agents SDK Integration


```python
from agents import Agent, Runner
from agents.mcp import MCPServerHTTP

stash_mcp = MCPServerHTTP(url="http://localhost:8080/mcp")

agent = Agent(
    name="my-agent",
    instructions="You have persistent memory. Use stash tools to remember and recall.",
    mcp_servers=[stash_mcp],
)

result = Runner.run_sync(agent, "What do you remember about my coding preferences?")
print(result.final_output)
```

Troubleshooting


Postgres connection refused


```bash
# Check pgvector extension is available
docker exec -it stash-postgres-1 psql -U stash -c "SELECT * FROM pg_extension WHERE extname='vector';"

# If missing, install it
docker exec -it stash-postgres-1 psql -U stash -c "CREATE EXTENSION vector;"
```

MCP server not reachable from Claude Desktop


  • Ensure http://localhost:8080/mcp is accessible (not https)
  • Check that Claude Desktop supports HTTP MCP transport (requires Claude Desktop ≥ 0.10)
  • Try curl http://localhost:8080/mcp to verify the server is up

Consolidation not running


```bash
# Check logs for consolidation pipeline errors
docker compose logs stash | grep -i consolidat

# Verify LLM credentials are correct (consolidation uses the LLM to extract facts)
curl $LLM_BASE_URL/models -H "Authorization: Bearer $LLM_API_KEY"
```

Embedding/recall returning no results

  • Consolidation may not have run yet (wait one interval or trigger manually)
  • Verify the LLM model supports embeddings or that a separate embedding model is configured
  • Check that episodes were actually stored: curl http://localhost:8080/api/episodes?agent_id=my-agent

Resetting all memory


```bash
# Nuclear option: wipe and restart
docker compose down -v
docker compose up
```

Key Endpoints Reference


| Method | Path | Description |
|--------|------|-------------|
| POST | /api/episodes | Store a new episode |
| POST | /api/recall | Semantic recall query |
| GET | /api/facts | List consolidated facts |
| GET/PUT | /api/context/:agent_id | Working context |
| DELETE | /api/episodes/:id | Forget an episode |
| POST | /consolidate | Trigger consolidation manually |
| GET | /health | Health check |
| * | /mcp | MCP protocol endpoint |

Self-Hosting Checklist


  • Postgres 16+ with the pgvector extension
  • LLM API key with access to a chat-completion model
  • Port 8080 accessible to your MCP clients
  • Volume mounted for Postgres data persistence
  • CONSOLIDATION_INTERVAL tuned to your usage (default 5m)
  • Agent IDs consistent across sessions for memory continuity