chroma-local
Instructions
Determine these before writing code. Prefer discovering them from the repo and the user request. Ask only when the choice materially changes the implementation.
- Runtime shape
  - Are they connecting to a running local server, embedding Chroma into tests, or setting up local development from scratch?
  - Decide whether they need `chroma run`, a Docker or service command, `HttpClient` or `ChromaClient`, or Python `EphemeralClient`.
- Persistence
  - Persistent local data: choose an intentional data path.
  - Disposable test data: use defaults or a temp directory.
- Embedding model
  - Reuse the app's existing embedding provider when possible.
  - Otherwise default to `@chroma-core/default-embed` in TypeScript or the standard local default in Python.
  - If the user explicitly wants OpenAI embeddings in TypeScript, install and use `@chroma-core/openai`.
- Indexed data shape
  - Determine what is being indexed, how it should be chunked, and what metadata is needed for filtering and updates.
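One way to keep the runtime shape and persistence decisions explicit is to resolve them in a single place. The following is a minimal Python sketch, not a prescribed layout: `ChromaSettings`, `validate_settings`, and `build_client` are hypothetical names introduced here, and the mode strings are assumptions. The client constructors it calls (`HttpClient`, `PersistentClient`, `EphemeralClient`) come from the `chromadb` package.

```python
from dataclasses import dataclass
from typing import Optional

VALID_MODES = ("http", "persistent", "ephemeral")


@dataclass(frozen=True)
class ChromaSettings:
    """Hypothetical settings object; field names are illustrative only."""
    mode: str
    host: str = "localhost"
    port: int = 8000
    path: Optional[str] = None


def validate_settings(settings: ChromaSettings) -> None:
    """Fail loudly instead of silently falling back to another mode."""
    if settings.mode not in VALID_MODES:
        raise ValueError(f"unknown Chroma mode: {settings.mode!r}")
    if settings.mode == "persistent" and not settings.path:
        raise ValueError("persistent mode needs an explicit data path")


def build_client(settings: ChromaSettings):
    """Return a chromadb client for the requested mode."""
    validate_settings(settings)
    import chromadb  # imported lazily so validation works without the package

    if settings.mode == "http":
        return chromadb.HttpClient(host=settings.host, port=settings.port)
    if settings.mode == "persistent":
        return chromadb.PersistentClient(path=settings.path)
    # Ephemeral: in-memory only, data is lost when the process exits.
    return chromadb.EphemeralClient()
```

A test suite might call `build_client(ChromaSettings(mode="ephemeral"))`, while local development would pass `mode="persistent"` with a chosen path, so the mode is always visible at the call site.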
Routing
- Existing local server
  - Confirm host and port before changing client code.
  - Validate the server is reachable before assuming collections are missing.
- Fresh local development
  - Add a local startup path such as `chroma run` or the repo's existing Docker or service command.
  - Default to `localhost:8000` unless the repo already uses another address.
- Python tests or disposable local workflows
  - Prefer `EphemeralClient` when persistence is unnecessary.
  - Call out that data is lost when the process exits.
- Persistent local development
  - Use a stable data path and make persistence explicit in code or config.
  - Do not silently switch between ephemeral and persistent modes.
- Search integration work
  - Use `getOrCreateCollection()` in TypeScript or `get_or_create_collection()` in Python.
  - Design document IDs and metadata so upserts and deletes are straightforward.
  - Batch writes when syncing large datasets.
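The batching advice for search integration work can be sketched in Python as follows. This is an illustration under assumptions: `upsert_in_batches` is a hypothetical helper, the default batch size of 500 is an arbitrary placeholder rather than a Chroma limit, and `collection` is expected to be a chromadb collection (anything exposing `upsert(ids=..., documents=..., metadatas=...)`).

```python
from typing import Any, Dict, Sequence


def upsert_in_batches(
    collection: Any,
    ids: Sequence[str],
    documents: Sequence[str],
    metadatas: Sequence[Dict[str, Any]],
    batch_size: int = 500,  # assumption: tune for the dataset and server
) -> int:
    """Upsert records in fixed-size slices and return the number of calls made."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if not (len(ids) == len(documents) == len(metadatas)):
        raise ValueError("ids, documents, and metadatas must have equal length")

    calls = 0
    for start in range(0, len(ids), batch_size):
        end = start + batch_size
        # upsert inserts new IDs and overwrites existing ones, so reruns are safe
        collection.upsert(
            ids=list(ids[start:end]),
            documents=list(documents[start:end]),
            metadatas=list(metadatas[start:end]),
        )
        calls += 1
    return calls
```

Using `upsert` rather than `add` means a repeated sync with the same IDs overwrites records instead of failing or duplicating them.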
Ask vs proceed
Ask first:
- Embedding model choice (cost and quality implications)
- Whether they need persistent local data
- How they are starting the local server
- Multi-tenant data isolation strategy
Proceed with sensible defaults:
- Use `getOrCreateCollection()` (TypeScript) / `get_or_create_collection()` (Python)
- Use cosine similarity (most common)
- Chunk size under 8KB
- Store source IDs in metadata for updates/deletes
- Use a local server on `localhost:8000` unless the repo already configures another address or is using Python `EphemeralClient`
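The chunk-size and source-ID defaults above could look like this in Python. It is a rough sketch under stated assumptions: `chunk_text` and `build_records` are hypothetical helpers, the whitespace-based greedy split is only one possible chunking strategy, and the `source_id:index` ID scheme and metadata keys are illustrative choices.

```python
from typing import Any, Dict, List, Tuple

MAX_CHUNK_BYTES = 8 * 1024  # the "under 8KB" default from the list above


def chunk_text(text: str, max_bytes: int = MAX_CHUNK_BYTES) -> List[str]:
    """Greedily pack whitespace-separated words into chunks under max_bytes (UTF-8).

    A single word that is itself max_bytes or longer is kept whole.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_size = 0
    for word in text.split():
        word_size = len(word.encode("utf-8"))
        added = word_size + (1 if current else 0)  # +1 for the joining space
        if current and current_size + added >= max_bytes:
            chunks.append(" ".join(current))
            current, current_size, added = [], 0, word_size
        current.append(word)
        current_size += added
    if current:
        chunks.append(" ".join(current))
    return chunks


def build_records(
    source_id: str, text: str, max_bytes: int = MAX_CHUNK_BYTES
) -> Tuple[List[str], List[str], List[Dict[str, Any]]]:
    """Return (ids, documents, metadatas) with the source ID stored in metadata."""
    documents = chunk_text(text, max_bytes)
    ids = [f"{source_id}:{i}" for i in range(len(documents))]
    metadatas = [
        {"source_id": source_id, "chunk_index": i} for i in range(len(documents))
    ]
    return ids, documents, metadatas
```

Deterministic IDs let a re-index overwrite the same records, and the `source_id` metadata field gives updates and deletes something to filter on.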
What to validate
- Correct client import (`ChromaClient`, `HttpClient`, or `Client`)
- Embedding function package is installed (TypeScript)
- Local server is reachable before assuming collections are missing
- Local path and persistence mode are intentional
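For the reachability check, a quick TCP probe is one option before constructing a client. The sketch below uses only the Python standard library; `server_is_reachable` is a hypothetical helper, and the defaults assume the `localhost:8000` address used elsewhere in this skill. It only confirms that something is listening on the port, not that the listener is Chroma.

```python
import socket


def server_is_reachable(
    host: str = "localhost", port: int = 8000, timeout: float = 2.0
) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

If this returns `False`, the likely problem is the server or the address, so it is worth resolving that before concluding that a collection is missing.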
Implementation notes
- Local Chroma is the right default for development, tests, and self-hosted deployments.
- OSS Chroma does not include Chroma Cloud-only features such as `Schema()` and `Search()`.
- If the user asks for hybrid dense and sparse retrieval, treat that as a likely Chroma Cloud requirement unless the repo already implements an OSS workaround.
- For open source Chroma, dense retrieval with a single embedding function is the normal baseline.
Minimal patterns
Start a local Chroma server when the repo needs one:
```bash
chroma run
```
Default address: `localhost:8000`.
TypeScript local client:
```typescript
import { ChromaClient } from 'chromadb';
import { DefaultEmbeddingFunction } from '@chroma-core/default-embed';

const client = new ChromaClient();
const embeddingFunction = new DefaultEmbeddingFunction();
const collection = await client.getOrCreateCollection({
  name: 'my_collection',
  embeddingFunction,
});

// Add documents
await collection.add({
  ids: ['doc1', 'doc2'],
  documents: ['First document text', 'Second document text'],
});

// Query
const results = await collection.query({
  queryTexts: ['search query'],
  nResults: 5,
});
```
Python local client:
```python
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_or_create_collection(name="my_collection")

# Add documents
collection.add(
    ids=["doc1", "doc2"],
    documents=["First document text", "Second document text"],
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5,
)
```
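The minimal patterns cover adding and querying. For updates and deletes, a source ID stored in metadata makes it possible to replace everything that came from one source. The following Python sketch assumes a `source_id` metadata key (an illustrative name) and a hypothetical helper `replace_source`; `collection` is expected to be a chromadb collection, whose `delete` accepts a `where` metadata filter.

```python
from typing import Any, Sequence


def replace_source(
    collection: Any,
    source_id: str,
    ids: Sequence[str],
    documents: Sequence[str],
) -> None:
    """Drop every record for source_id, then write the fresh chunks."""
    # Remove stale chunks first so a shorter document leaves no leftovers.
    collection.delete(where={"source_id": source_id})
    if not ids:
        return  # the source was removed entirely; nothing to write
    collection.upsert(
        ids=list(ids),
        documents=list(documents),
        metadatas=[{"source_id": source_id} for _ in ids],
    )
```

Deleting by filter before writing avoids orphaned chunks when a source shrinks between syncs.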
Learn More
Fetch Chroma's `llms.txt` only when you need API or product details that are not already in the repo or this skill: https://docs.trychroma.com/llms.txt
Available Topics
TypeScript
- Chroma Regex Filtering - Learn how to use regex filters in Chroma queries
- Query and Get - Query and Get Data from Chroma Collections
- Metadata - Store and query metadata, including filters and array values
- Updating and Deleting - Update existing documents and delete data from collections
- Error Handling - Handling errors and failures when working with Chroma
- Local Chroma - How to run and use local Chroma
Python
- Chroma Regex Filtering - Learn how to use regex filters in Chroma queries
- Query and Get - Query and Get Data from Chroma Collections
- Metadata - Store and query metadata, including filters and array values
- Updating and Deleting - Update existing documents and delete data from collections
- Error Handling - Handling errors and failures when working with Chroma
- Local Chroma - How to run and use local Chroma
General
- Data Model - An overview of how Chroma stores data
- Integrating Chroma into an existing system - Guidance for adding Chroma search to an existing application
- Chroma CLI - Starting and managing a local open source Chroma server from the CLI