llamaindex-development
LlamaIndex Development
You are an expert in LlamaIndex for building RAG (Retrieval-Augmented Generation) applications, data indexing, and LLM-powered applications with Python.
Key Principles
- Write concise, technical responses with accurate Python examples
- Use functional, declarative programming; avoid classes where possible
- Prioritize code quality, maintainability, and performance
- Use descriptive variable names that reflect their purpose
- Follow PEP 8 style guidelines
Code Organization
Directory Structure
```
project/
├── data/              # Source documents and data
├── indexes/           # Persisted index storage
├── loaders/           # Custom document loaders
├── retrievers/        # Custom retriever implementations
├── query_engines/     # Query engine configurations
├── prompts/           # Custom prompt templates
├── transformations/   # Document transformations
├── callbacks/         # Custom callback handlers
├── utils/             # Utility functions
├── tests/             # Test files
└── config/            # Configuration files
```

Naming Conventions
- Use snake_case for files, functions, and variables
- Use PascalCase for classes
- Prefix private functions with underscore
- Use descriptive names (e.g., `create_vector_index`, `build_query_engine`)
Document Loading
Using Document Loaders
```python
from llama_index.core import SimpleDirectoryReader
from llama_index.readers.file import PDFReader, DocxReader

# Load from directory
documents = SimpleDirectoryReader(
    input_dir="./data",
    recursive=True,
    required_exts=[".pdf", ".txt", ".md"]
).load_data()

# Load specific file types
pdf_reader = PDFReader()
documents = pdf_reader.load_data(file="document.pdf")
```

Custom Loaders
```python
from llama_index.core.readers.base import BaseReader
from llama_index.core import Document

class CustomLoader(BaseReader):
    def load_data(self, file_path: str) -> list[Document]:
        # Custom loading logic
        with open(file_path, "r") as f:
            content = f.read()
        return [Document(
            text=content,
            metadata={"source": file_path}
        )]
```

Text Splitting and Processing
Node Parsing
```python
from llama_index.core.node_parser import (
    SentenceSplitter,
    SemanticSplitterNodeParser,
    MarkdownNodeParser
)

# Simple sentence splitting
splitter = SentenceSplitter(
    chunk_size=1024,
    chunk_overlap=200
)
nodes = splitter.get_nodes_from_documents(documents)

# Semantic splitting (preserves meaning)
from llama_index.embeddings.openai import OpenAIEmbedding

semantic_splitter = SemanticSplitterNodeParser(
    embed_model=OpenAIEmbedding(),
    breakpoint_percentile_threshold=95
)

# Markdown-aware splitting
markdown_splitter = MarkdownNodeParser()
```

Best Practices for Chunking
- Choose chunk size based on your embedding model's context window
- Use overlap to maintain context between chunks
- Preserve document structure when possible
- Include metadata for filtering and retrieval
- Use semantic splitting for better coherence
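The role of overlap can be sketched without LlamaIndex. This toy splitter (illustrative only, not the library's implementation) shows how each chunk's tail reappears at the head of the next chunk, so no sentence boundary is ever lost to a hard cut:

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks whose tails repeat in the next chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=4)
# Each chunk's last 4 characters reappear at the start of the next chunk.
```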
Vector Stores and Indexing
Creating Indexes
```python
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

# In-memory index
index = VectorStoreIndex.from_documents(documents)

# With persistent vector store
chroma_client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = chroma_client.get_or_create_collection("my_collection")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context
)
```

Supported Vector Stores
- Chroma (local development)
- Pinecone (production, managed)
- Weaviate (production, self-hosted or managed)
- Qdrant (production, self-hosted or managed)
- PostgreSQL with pgvector
- MongoDB Atlas Vector Search
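All of these expose essentially the same add/query contract, which is why swapping stores is mostly a configuration change. A toy in-memory store (purely illustrative, not any real backend) shows the shape of that contract:

```python
import math

class ToyVectorStore:
    """Minimal add/query interface shared by real vector stores."""
    def __init__(self):
        self._rows: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, vector: list[float]) -> None:
        self._rows.append((doc_id, vector))

    def query(self, vector: list[float], top_k: int = 2) -> list[str]:
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
        ranked = sorted(self._rows, key=lambda row: cosine(vector, row[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:top_k]]

store = ToyVectorStore()
store.add("doc_a", [1.0, 0.0])
store.add("doc_b", [0.0, 1.0])
store.add("doc_c", [0.7, 0.7])
print(store.query([1.0, 0.1]))  # nearest first: doc_a, then doc_c
```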
Index Persistence
```python
from llama_index.core import StorageContext, load_index_from_storage

# Persist index
index.storage_context.persist(persist_dir="./storage")

# Load index
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
```

Query Engines
Basic Query Engine
```python
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(
    similarity_top_k=5,
    response_mode="compact"
)
response = query_engine.query("What is the main topic?")
print(response.response)
```

Response Modes
- `refine`: Iteratively refine the answer through each node
- `compact`: Combine chunks before sending to the LLM
- `tree_summarize`: Build a tree and summarize
- `simple_summarize`: Truncate and summarize
- `accumulate`: Accumulate responses from each node
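The difference between `compact` and `accumulate` can be sketched with a stand-in `llm` function (a hypothetical stub, not the library's response synthesizer):

```python
from typing import Callable

def llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call.
    return f"answer({len(prompt)} chars)"

def compact(chunks: list[str], question: str, call: Callable[[str], str]) -> str:
    # compact: one call over the concatenated chunks.
    context = "\n".join(chunks)
    return call(f"{context}\n\nQ: {question}")

def accumulate(chunks: list[str], question: str, call: Callable[[str], str]) -> list[str]:
    # accumulate: one call per chunk; responses are collected, not merged.
    return [call(f"{chunk}\n\nQ: {question}") for chunk in chunks]

chunks = ["chunk one", "chunk two", "chunk three"]
single = compact(chunks, "topic?", llm)       # one combined answer
per_chunk = accumulate(chunks, "topic?", llm) # one answer per chunk
```

`compact` minimizes LLM calls and cost; `accumulate` preserves per-chunk answers at the price of more calls.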
Advanced Query Engine
```python
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor

query_engine = RetrieverQueryEngine.from_args(
    retriever=index.as_retriever(similarity_top_k=10),
    node_postprocessors=[
        SimilarityPostprocessor(similarity_cutoff=0.7)
    ],
    response_mode="compact"
)
```

Retrievers
Custom Retrievers
```python
from llama_index.core.retrievers import VectorIndexRetriever

# Basic retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10
)

# Retrieve nodes
nodes = retriever.retrieve("search query")
```

Hybrid Search
```python
from llama_index.core.retrievers import QueryFusionRetriever

# Combine multiple retrieval strategies
retriever = QueryFusionRetriever(
    [
        index.as_retriever(similarity_top_k=5),
        bm25_retriever,  # Keyword-based
    ],
    num_queries=4,
    use_async=True
)
```
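Fusion retrievers typically merge the ranked lists from each strategy with reciprocal rank fusion. A minimal sketch of the scoring idea (illustrative, not `QueryFusionRetriever`'s exact implementation):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists; documents ranked highly anywhere float up."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # 1-based rank; k damps the influence of any single list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]   # dense retrieval order
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # BM25 order
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
# doc_b ranks first: it appears near the top of both lists.
```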
Embeddings
Embedding Models
```python
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

# OpenAI embeddings
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    dimensions=512  # Optional dimension reduction
)

# Local embeddings
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)
```

LLM Configuration
Setting Up LLMs
```python
from llama_index.llms.openai import OpenAI
from llama_index.llms.anthropic import Anthropic
from llama_index.core import Settings

# OpenAI
Settings.llm = OpenAI(
    model="gpt-4o",
    temperature=0.1
)

# Anthropic
Settings.llm = Anthropic(
    model="claude-sonnet-4-20250514",
    temperature=0.1
)
```

Agents
Building Agents
```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# Create tools from query engines
tools = [
    QueryEngineTool(
        query_engine=documents_query_engine,
        metadata=ToolMetadata(
            name="documents",
            description="Search through documents"
        )
    ),
    QueryEngineTool(
        query_engine=code_query_engine,
        metadata=ToolMetadata(
            name="codebase",
            description="Search through code"
        )
    )
]

# Create agent
agent = ReActAgent.from_tools(
    tools,
    llm=llm,
    verbose=True
)
response = agent.chat("Find information about X")
```

Performance Optimization
Caching
```python
from llama_index.core import Settings
from llama_index.core.llms import LLMCache
from llama_index.llms.openai import OpenAI

# Enable LLM response caching
Settings.llm = OpenAI(model="gpt-4o")
Settings.llm_cache = LLMCache()
```

Async Operations
```python
import asyncio

# Use async for better performance
response = await query_engine.aquery("question")

# Batch processing
responses = await asyncio.gather(*[
    query_engine.aquery(q) for q in questions
])
```

Embedding Optimization
- Batch embeddings when possible
- Use smaller embedding dimensions when accuracy allows
- Cache embeddings for repeated documents
- Use local models for cost-sensitive applications
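Caching embeddings for repeated documents can be as simple as memoizing on a content hash. A minimal sketch (the `embed_fn` here is a hypothetical stand-in for a real embedding call, not a LlamaIndex API):

```python
import hashlib
from typing import Callable

def make_cached_embedder(embed_fn: Callable[[str], list[float]]) -> Callable[[str], list[float]]:
    """Wrap an embedding function so identical texts are embedded only once."""
    cache: dict[str, list[float]] = {}

    def embed(text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = embed_fn(text)
        return cache[key]

    return embed

calls: list[str] = []

def embed_fn(text: str) -> list[float]:
    # Stand-in for a real (and costly) embedding API call.
    calls.append(text)
    return [float(len(text))]

embed = make_cached_embedder(embed_fn)
embed("same doc"); embed("same doc"); embed("other doc")
# embed_fn was invoked only twice despite three lookups.
```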
Error Handling
```python
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

# Debug handler for troubleshooting
debug_handler = LlamaDebugHandler()
callback_manager = CallbackManager([debug_handler])
Settings.callback_manager = callback_manager
```

Testing
- Unit test document loaders and transformations
- Test retrieval quality with known queries
- Validate index persistence and loading
- Test query engine responses
- Monitor retrieval metrics (precision, recall)
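Precision and recall over a labeled query set reduce to set arithmetic against known relevant documents; a minimal sketch:

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Precision: fraction of retrieved docs that are relevant.
    Recall: fraction of relevant docs that were retrieved."""
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

p, r = precision_recall(["d1", "d2", "d3", "d4"], {"d1", "d3", "d9"})
# d1 and d3 are hits: precision = 2/4 = 0.5, recall = 2/3
```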
Dependencies
- llama-index
- llama-index-embeddings-openai
- llama-index-llms-openai
- llama-index-vector-stores-chroma
- chromadb
- python-dotenv
- pydantic