rag-systems
RAG Systems
Build Retrieval-Augmented Generation systems for grounded responses.
When to Use This Skill
Invoke this skill when:
- Building Q&A over custom documents
- Implementing semantic search
- Setting up vector databases
- Optimizing retrieval quality
Parameter Schema
| Parameter | Type | Required | Description | Default |
|---|---|---|---|---|
| | string | Yes | RAG goal | - |
| | enum | No | | |
| | string | No | Embedding model | |
| | int | No | Chunk size in chars | |
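For illustration only, an invocation might bundle these values as follows. The parameter names in this sketch (`goal`, `embedding_model`, `chunk_size`) are hypothetical placeholders, since the schema above does not show the actual names; the types, descriptions, and defaults are from the table.

```python
# Hypothetical parameter names -- the schema above omits them.
# Types and descriptions follow the table; values are examples.
rag_request = {
    "goal": "Q&A over internal engineering docs",  # string, required: RAG goal
    "embedding_model": "text-embedding-3-small",   # string, optional: embedding model
    "chunk_size": 1000,                            # int, optional: chunk size in chars
}
```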
Quick Start
```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Split documents
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)

# 2. Create vector store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(chunks, embeddings)

# 3. Retrieve
docs = vectorstore.similarity_search("query", k=5)
```
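To close the loop from retrieval to generation, the retrieved chunks can be stuffed into a grounded prompt for an LLM. A minimal sketch, assuming a chat model from langchain_openai; the model name and prompt wording are assumptions:

```python
from langchain_openai import ChatOpenAI

# Assumption: gpt-4o-mini as the generation model; swap in your own.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

question = "query"
docs = vectorstore.similarity_search(question, k=5)

# Stuff the retrieved chunks into a single context block.
context = "\n\n".join(doc.page_content for doc in docs)

prompt = (
    "Answer using only the context below. "
    "If the answer is not in the context, say you don't know.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)

answer = llm.invoke(prompt)
print(answer.content)
```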
Chunking Strategy
| Content Type | Size (chars) | Overlap (chars) | Rationale |
|---|---|---|---|
| Technical docs | 500-800 | 100 | Preserve code |
| Legal docs | 1000-1500 | 200 | Keep clauses |
| Q&A/FAQ | 200-400 | 50 | Atomic answers |
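Applying the technical-docs row, a language-aware splitter keeps code blocks intact while staying inside the suggested sizes. A sketch assuming Python source and langchain_text_splitters; the exact sizes are picked from the ranges above:

```python
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

# Technical docs / code: smaller chunks with language-aware separators,
# so functions and code blocks are not split mid-way.
code_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON,
    chunk_size=600,
    chunk_overlap=100,
)

# Legal docs: larger chunks so whole clauses stay together.
legal_splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=200)

code_chunks = code_splitter.split_documents(documents)
```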
Embedding Costs
| Model | Cost/1M tokens |
|---|---|
| text-embedding-3-small | $0.02 |
| text-embedding-3-large | $0.13 |
| Cohere embed-v3 | $0.10 |
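These rates make it easy to estimate spend before embedding a corpus: total tokens divided by one million, times the per-million price. A sketch using tiktoken (cl100k_base is the tokenizer used by the text-embedding-3 models):

```python
import tiktoken

# Prices per 1M tokens, taken from the table above.
PRICE_PER_M = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}

enc = tiktoken.get_encoding("cl100k_base")

def embedding_cost(texts, model="text-embedding-3-small"):
    """Rough cost estimate: total tokens / 1M * price per 1M."""
    total_tokens = sum(len(enc.encode(t)) for t in texts)
    return total_tokens / 1_000_000 * PRICE_PER_M[model]

# e.g. cost of embedding the chunks from the Quick Start
print(f"${embedding_cost([c.page_content for c in chunks]):.4f}")
```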
Troubleshooting
| Issue | Solution |
|---|---|
| Irrelevant results | Improve chunking, add reranking |
| Missing context | Increase k, use parent retriever |
| Hallucinations | Add "only use context" prompt |
| Slow retrieval | Add caching, reduce k |
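For the "missing context" row, a parent-document retriever embeds small chunks for precise matching but returns their larger parent chunks as context. A sketch using LangChain's ParentDocumentRetriever; the collection name and chunk sizes are assumptions:

```python
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Small child chunks are embedded for precise matching;
# the larger parent chunks are what gets returned as context.
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)

vectorstore = Chroma(
    collection_name="parent_demo",
    embedding_function=OpenAIEmbeddings(model="text-embedding-3-small"),
)
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=InMemoryStore(),
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
)
retriever.add_documents(documents)

docs = retriever.invoke("query")
```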
Best Practices
- Always include source attribution
- Use hybrid search (dense + BM25); see the sketch after this list
- Implement reranking for quality
- Evaluate with RAGAS metrics
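For the hybrid search item, a BM25 retriever over the same chunks can be blended with the dense retriever from the Quick Start. A sketch assuming langchain_community's BM25Retriever (it needs the rank_bm25 package installed); the weights are just a starting point to tune:

```python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# Sparse (keyword) retriever over the same chunks.
bm25 = BM25Retriever.from_documents(chunks)
bm25.k = 5

# Dense (embedding) retriever from the Quick Start vector store.
dense = vectorstore.as_retriever(search_kwargs={"k": 5})

# Blend both ranked lists; tune the weights on your own data.
hybrid = EnsembleRetriever(retrievers=[bm25, dense], weights=[0.4, 0.6])

docs = hybrid.invoke("query")
```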
Related Skills
- llm-integration - LLM for generation
- agent-memory - Memory retrieval
- ai-agent-basics - Agentic RAG