rag-systems

RAG Systems

Build Retrieval-Augmented Generation systems for grounded responses.

When to Use This Skill

Invoke this skill when:
  • Building Q&A over custom documents
  • Implementing semantic search
  • Setting up vector databases
  • Optimizing retrieval quality

Parameter Schema

| Parameter | Type | Required | Description | Default |
|---|---|---|---|---|
| task | string | Yes | RAG goal | - |
| vector_db | enum | No | One of pinecone, weaviate, chroma, pgvector | chroma |
| embedding_model | string | No | Embedding model | text-embedding-3-small |
| chunk_size | int | No | Chunk size in chars | 1000 |
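
For illustration only, an invocation of this skill might pass the parameters as a simple mapping. The values below are made up; only the field names and defaults come from the schema above.

```python
# Illustrative invocation payload (example values; only the defaults above are real).
params = {
    "task": "Answer questions over the internal engineering handbook",  # required
    "vector_db": "chroma",                        # pinecone | weaviate | chroma | pgvector
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,                           # characters per chunk
}
```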

Quick Start

```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter
```

1. Split documents

```python
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)
```

2. Create vector store

```python
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(chunks, embeddings)
```

3. Retrieve

```python
docs = vectorstore.similarity_search("query", k=5)
```
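
The quick start stops at retrieval. As a minimal sketch of the generation step, assuming ChatOpenAI and a deliberately strict prompt (the model name and prompt wording are assumptions, not part of the original), the retrieved chunks can be passed as context:

```python
from langchain_openai import ChatOpenAI

# Sketch only: stuff the retrieved chunks into a grounded prompt.
question = "query"  # placeholder, same as in step 3
context = "\n\n".join(doc.page_content for doc in docs)
llm = ChatOpenAI(model="gpt-4o-mini")  # assumed model name; substitute your own
prompt = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, say you don't know.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)
answer = llm.invoke(prompt)
print(answer.content)
```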

Chunking Strategy

| Content Type | Size (chars) | Overlap (chars) | Rationale |
|---|---|---|---|
| Technical docs | 500-800 | 100 | Preserve code |
| Legal docs | 1000-1500 | 200 | Keep clauses |
| Q&A/FAQ | 200-400 | 50 | Atomic answers |
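
As a rough sketch of how the table might translate into code, each content type gets its own splitter configuration; the variable names are illustrative and the sizes simply take the upper bound of each range:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Illustrative splitter settings mirroring the table above (sizes in characters).
technical_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
legal_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200)
faq_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)

# Pick the splitter that matches your content type.
chunks = technical_splitter.split_documents(documents)
```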

Embedding Costs

| Model | Cost per 1M tokens |
|---|---|
| text-embedding-3-small | $0.02 |
| text-embedding-3-large | $0.13 |
| Cohere embed-v3 | $0.10 |
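
A quick sanity check on the numbers: cost scales linearly with token count, so embedding 10M tokens with text-embedding-3-small costs 10 × $0.02 = $0.20. A tiny helper, purely illustrative:

```python
# Illustrative cost estimate using the per-million-token prices listed above.
PRICE_PER_1M = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "cohere-embed-v3": 0.10,
}

def embedding_cost(total_tokens: int, model: str = "text-embedding-3-small") -> float:
    """Estimated USD cost to embed total_tokens tokens with the given model."""
    return total_tokens / 1_000_000 * PRICE_PER_1M[model]

print(embedding_cost(10_000_000))  # 0.2
```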

Troubleshooting

| Issue | Solution |
|---|---|
| Irrelevant results | Improve chunking, add reranking |
| Missing context | Increase k, use parent retriever |
| Hallucinations | Add "only use context" prompt |
| Slow retrieval | Add caching, reduce k |
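
One common fix for the "Irrelevant results" row is to over-fetch candidates and rerank them with a cross-encoder. A minimal sketch, assuming the sentence-transformers package and its ms-marco checkpoint (both assumptions, not prescribed by this skill):

```python
from sentence_transformers import CrossEncoder

# Sketch: over-fetch candidates, rerank with a cross-encoder, keep the best few.
query = "query"
candidates = vectorstore.similarity_search(query, k=20)
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # assumed checkpoint
scores = reranker.predict([(query, doc.page_content) for doc in candidates])
ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
top_docs = [doc for _, doc in ranked[:5]]
```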

Best Practices

  • Always include source attribution
  • Use hybrid search (dense + BM25); see the sketch after this list
  • Implement reranking for quality
  • Evaluate with RAGAS metrics
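
A minimal sketch of the hybrid-search recommendation, assuming LangChain's BM25Retriever (which needs the rank_bm25 package) and EnsembleRetriever, reusing chunks and vectorstore from the quick start:

```python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# Sketch: combine sparse keyword scores (BM25) with dense vector similarity.
bm25 = BM25Retriever.from_documents(chunks)  # requires the rank_bm25 package
bm25.k = 5
dense = vectorstore.as_retriever(search_kwargs={"k": 5})
hybrid = EnsembleRetriever(retrievers=[bm25, dense], weights=[0.4, 0.6])

docs = hybrid.invoke("query")
```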

Related Skills

  • llm-integration - LLM for generation
  • agent-memory - Memory retrieval
  • ai-agent-basics - Agentic RAG
