dspy-code
DSPy-Code Skill
Specialized AI assistant for building LLM applications with DSPy
When to Use This Skill
Activate dspy-code for:
Development Tasks
- Creating DSPy modules, signatures, and pipelines
- Building RAG (Retrieval-Augmented Generation) systems
- Implementing multi-hop reasoning and complex workflows
- Designing typed outputs with Pydantic schemas
- Creating agents with tool use (ReAct patterns)
- Writing custom metrics and evaluation functions
Optimization Tasks
- Running prompt optimization with GEPA
- Compiling modules with BootstrapFewShot, MIPRO, COPRO
- Hyperparameter tuning and grid search
- A/B testing optimized vs unoptimized modules
- Statistical significance testing
- Performance benchmarking
Project Management
- Initializing new DSPy projects
- Connecting to existing workspaces
- Generating demos from templates
- Validating DSPy code for correctness
- Exporting to Python, JSON, YAML formats
- Session tracking and history
Learning & Exploration
- Understanding DSPy patterns and anti-patterns
- Choosing the right predictor for your task
- Selecting optimal optimizers based on data size
- Learning about 10 predictors, 11 optimizers, 4 adapters
- Exploring 12 production-ready module templates
Key Principle
Use dspy-code for ALL DSPy-related development
DSPy is fundamentally different from traditional prompt engineering:
- Programming, not prompting - Write declarative modules, not brittle prompts
- Signatures define contracts - Clear input/output specifications
- Optimizers tune automatically - No manual prompt engineering
- Composition over monoliths - Build complex programs from simple modules
Core Capabilities
1. Deep DSPy Knowledge
10 Predictor Types:
- Predict - Basic predictor
- ChainOfThought - CoT reasoning
- ChainOfThoughtWithHint - CoT with hints
- ProgramOfThought - Code execution for math
- ReAct - Reasoning + Acting for agents
- MultiChainComparison - Compare multiple chains
- Retrieve - Document retrieval
- TypedPredictor - Type-constrained outputs
- Ensemble - Multiple predictor voting
- majority - Majority voting aggregation
11 Optimizer Types:
- BootstrapFewShot - Example-based (10-50 examples, ⚡⚡⚡ fast)
- BootstrapFewShotWithRandomSearch - Hyperparameter tuning (50+, ⚡⚡)
- BootstrapFewShotWithOptuna - Optuna integration (50+, ⚡⚡)
- COPRO - Prompt optimization (50+, ⚡⚡, ⭐⭐⭐⭐)
- MIPRO - Multi-stage instruction (100+, ⚡, ⭐⭐⭐⭐⭐)
- MIPROv2 - Enhanced MIPRO (200+, ⚡, ⭐⭐⭐⭐⭐)
- BetterTogether - Collaborative optimization (100+, ⚡⚡)
- Ensemble - Ensemble methods (100+, ⚡, ⭐⭐⭐⭐)
- KNNFewShot - KNN-based selection (100+, ⚡⚡, ⭐⭐⭐⭐)
- LabeledFewShot - Labeled examples (50+, ⚡⚡⚡)
- SignatureOptimizer - Signature tuning (100+, ⚡⚡)
4 Adapter Types:
- ChatAdapter - Chat model integration
- JSONAdapter - JSON output formatting
- FunctionAdapter - Function calling
- ImageAdapter - Image input handling
Built-in Metrics:
- Accuracy (classification tasks)
- F1 Score (multi-label classification)
- ROUGE-L (text generation quality)
- BLEU (translation quality)
- Exact Match (strict comparison)
- Semantic Similarity (embedding-based)
- Custom metrics (user-defined)
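The metrics above are listed by name only. As a rough sketch of what two of them might look like in the `(example, prediction, trace=None)` shape used by the metric in the examples later in this document, here are an exact-match check and a token-level F1. The helper names, the whitespace normalizer, and the `SimpleNamespace` stand-ins are assumptions for illustration; token-level F1 is one common reading of F1 for text answers and is not necessarily the multi-label variant named in the list.

```python
from collections import Counter
from types import SimpleNamespace


def normalize(text):
    """Lowercase and split on whitespace; a deliberately simple normalizer."""
    return text.lower().split()


def exact_match(example, prediction, trace=None):
    """Strict comparison of the normalized token sequences."""
    return normalize(example.answer) == normalize(prediction.answer)


def token_f1(example, prediction, trace=None):
    """Harmonic mean of token-level precision and recall."""
    gold, pred = normalize(example.answer), normalize(prediction.answer)
    overlap = sum((Counter(gold) & Counter(pred)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Stand-ins for dspy.Example / dspy.Prediction: any object with `.answer` works here.
gold = SimpleNamespace(answer="Paris is the capital")
guess = SimpleNamespace(answer="the capital is Paris France")
print(exact_match(gold, guess), round(token_f1(gold, guess), 3))
```

Because both functions only read `.answer`, they can be exercised without a configured language model before being passed to an optimizer.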
2. GEPA Integration
Genetic-Evolutionary Prompt Architecture for automatic prompt optimization:
python
from dspy.gepa import GEPA

gepa = GEPA(
    metric=accuracy,
    population_size=10,
    generations=20,
    mutation_rate=0.3,
    crossover_rate=0.7
)
result = gepa.optimize(
    seed_prompt="question -> answer",
    training_examples=trainset,
    budget=100  # Max LLM calls
)
GEPA Workflow:
- Initialize population with prompt variants
- Evaluate each variant on training data
- Select best-performing prompts
- Crossover and mutate to create new variants
- Repeat for N generations
- Return optimized prompt
When to use GEPA:
- Prompt engineering bottleneck
- Need automatic optimization
- Have 50+ training examples
- Want to explore prompt space systematically
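The workflow steps listed in this section (initialize, evaluate, select, crossover and mutate, repeat, return) can be illustrated with a toy genetic loop in plain Python. This is a minimal sketch under stated assumptions, not the GEPA implementation: prompts are modelled as tuples of phrases, `toy_score` stands in for a real metric averaged over training examples, and the `budget` cap on LLM calls is left out.

```python
import random


def evolve(seed, phrases, score, population_size=10, generations=20,
           mutation_rate=0.3, crossover_rate=0.7, rng=None):
    """Toy genetic search over prompts represented as tuples of phrases."""
    rng = rng or random.Random(0)

    def mutate(prompt):
        # Append a random phrase with probability mutation_rate.
        if rng.random() < mutation_rate:
            return prompt + (rng.choice(phrases),)
        return prompt

    def crossover(a, b):
        # Join a prefix of one parent with a suffix of the other.
        if rng.random() < crossover_rate and a and b:
            return a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
        return a

    # Initialize the population with mutated copies of the seed.
    population = [seed] + [mutate(seed) for _ in range(population_size - 1)]
    for _ in range(generations):
        # Evaluate and keep the better half; the best prompt always survives.
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: max(2, population_size // 2)]
        # Refill the population with crossed-over, mutated children.
        children = [mutate(crossover(*rng.sample(parents, 2)))
                    for _ in range(population_size - len(parents))]
        population = parents + children
    # Return the best prompt found.
    return max(population, key=score)


# Hypothetical phrase pool and scoring function, for illustration only.
PHRASES = ["Think step by step.", "Answer concisely.", "Cite the context."]


def toy_score(prompt):
    # Reward distinct useful phrases, penalize length.
    return len(set(prompt) & set(PHRASES)) - 0.1 * len(prompt)


seed = ("Answer the question.",)
best = evolve(seed, PHRASES, toy_score)
print(" ".join(best), toy_score(best))
```

Keeping the top-ranked prompts as parents means the best score never decreases from one generation to the next, which is why the result is at least as good as the seed.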
3. Session Management
Track development across multiple sessions:
python
session = {
    'id': 'session_123',
    'workspace': '/path/to/project',
    'created_at': '2024-01-15T10:30:00Z',
    'modules': [...],
    'optimizers': [...],
    'datasets': [...],
    'metrics': [...]
}
Session features:
- Workspace tracking
- Module registry
- Optimizer history
- Dataset versioning
- Metric tracking
- Export/import state
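The export/import feature is not specified further here. One plausible reading, sketched below, is a JSON round trip of the session dictionary; the function names, file name, and list entries are hypothetical and not part of dspy-code.

```python
import json
import tempfile
from pathlib import Path


def export_session(session, path):
    """Write the session dict to disk as JSON."""
    Path(path).write_text(json.dumps(session, indent=2), encoding="utf-8")


def import_session(path):
    """Load a previously exported session dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


session = {
    "id": "session_123",
    "workspace": "/path/to/project",
    "created_at": "2024-01-15T10:30:00Z",
    "modules": ["QAModule"],   # illustrative entries
    "optimizers": [],
    "datasets": [],
    "metrics": ["accuracy"],
}

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "session.json"
    export_session(session, target)
    restored = import_session(target)

print(restored == session)
```

Storing timestamps as ISO 8601 strings, as the session example does, keeps the whole structure JSON-serializable without custom encoders.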
4. Codebase RAG Indexing
Index existing DSPy codebases for contextual assistance:
typescript
interface CodebaseIndex {
  workspace: string;
  indexed_at: string;
  modules: Array<{
    path: string;
    name: string;
    signature?: string;
    type: string;
  }>;
  signatures: Array<{
    path: string;
    definition: string;
  }>;
  metrics: Array<{
    path: string;
    name: string;
    type: MetricType;
  }>;
}
Indexing enables:
- Fast module discovery
- Signature lookups
- Metric finding
- Pattern detection
- Dependency analysis
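How the index is built is not described. As a hedged sketch, module, signature, and metric discovery could be approximated with Python's standard `ast` module. The heuristics below (a `dspy.Module` base class, string literals containing `->`, functions whose first two parameters are `example` and `prediction`) are assumptions made for illustration, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

SOURCE = '''
import dspy

class QAModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.qa = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.qa(question=question)

def accuracy(example, prediction, trace=None):
    return example.answer == prediction.answer
'''


def index_source(source, path="<memory>"):
    """Collect dspy.Module subclasses, inline signatures, and metric-like functions."""
    index = {"modules": [], "signatures": [], "metrics": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            bases = [ast.unparse(base) for base in node.bases]
            if "dspy.Module" in bases:
                index["modules"].append({"path": path, "name": node.name})
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            # Heuristic: inline signatures look like "inputs -> outputs".
            if "->" in node.value:
                index["signatures"].append({"path": path, "definition": node.value})
        elif isinstance(node, ast.FunctionDef):
            args = [arg.arg for arg in node.args.args]
            # Heuristic: metrics take (example, prediction, ...).
            if args[:2] == ["example", "prediction"]:
                index["metrics"].append({"path": path, "name": node.name})
    return index


index = index_source(SOURCE)
print(index)
```

Parsing instead of importing means the scan works even when the indexed code has unmet dependencies or no configured language model.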
Two-Phase Workflow
Phase 1: Development
Goal: Build working DSPy modules
┌──────────────┐
│    /init     │  Initialize project structure
└──────┬───────┘
       │
       ▼
┌──────────────┐
│    Design    │  Define signatures and modules
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Implement   │  Write forward() methods
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  /validate   │  Check correctness
└──────────────┘
Commands:
- /init <project_name> - Create new DSPy project
- /connect - Connect to existing workspace
- /demo <template> - Generate demo from 12 templates
- /validate <file> - Validate module structure and signatures
Development checklist:
- Signatures defined with clear inputs/outputs
- Modules composed from predictors
- Forward methods implemented
- Type hints added where appropriate
- Unit tests written
- Validation passed
Phase 2: Optimization
Goal: Optimize modules for production
┌──────────────┐
│     Data     │  Prepare training/dev/test sets
└──────┬───────┘
       │
       ▼
┌──────────────┐
│    Metric    │  Define evaluation function
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  /optimize   │  Compile with optimizer
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   Evaluate   │  Test on dev/test sets
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   /export    │  Save optimized program
└──────────────┘
Commands:
- /optimize <module> - Run full optimization workflow
- /evaluate <module> - Evaluate on dev/test sets
- /export <format> - Export to Python/JSON/YAML
Optimization checklist:
- Training data prepared (10+ examples)
- Metric function defined
- Optimizer selected based on data size
- Compilation completed successfully
- Dev set evaluation performed
- A/B test against baseline
- Optimized program saved
- Production deployment planned
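The checklist item "Optimizer selected based on data size" can be made concrete with a small lookup that follows the example-count ranges given in the optimizer list of this document. This is only a sketch: several optimizers share the 50+ and 100+ ranges, so picking one name per band (and treating exactly 50 as the start of the COPRO band) is an assumption, and `suggest_optimizer` is a hypothetical helper, not a dspy-code function.

```python
def suggest_optimizer(num_examples):
    """Map training-set size to an optimizer name using this document's ranges."""
    if num_examples < 10:
        raise ValueError("the checklist asks for at least 10 training examples")
    if num_examples < 50:
        return "BootstrapFewShot"   # listed for 10-50 examples
    if num_examples < 100:
        return "COPRO"              # listed for 50+
    if num_examples < 200:
        return "MIPRO"              # listed for 100+
    return "MIPROv2"                # listed for 200+


for n in (10, 50, 100, 250):
    print(n, suggest_optimizer(n))
```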
Command Reference
/init <project_name>
Initialize new DSPy project with structure:
project_name/
├── modules/      # DSPy modules
├── data/         # Training/dev/test datasets
├── metrics/      # Custom metrics
├── optimized/    # Saved optimized programs
├── tests/        # Unit tests
└── config.py     # Configuration
Options:
- --template <name> - Use template (qa, rag, multi-hop, agent)
- --lm <model> - Set language model (gpt-3.5-turbo, gpt-4, claude-3, etc.)
- --retrieval - Include retrieval setup
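For readers who want the same layout without the command, the directory structure can be reproduced by hand. The sketch below is not the `/init` implementation; `scaffold` is a hypothetical helper that only creates the five folders and an empty `config.py`, and it ignores the `--template`, `--lm`, and `--retrieval` options.

```python
import tempfile
from pathlib import Path

SUBDIRS = ["modules", "data", "metrics", "optimized", "tests"]


def scaffold(project_name, root="."):
    """Create the project directory layout and an empty config.py."""
    project = Path(root) / project_name
    for sub in SUBDIRS:
        (project / sub).mkdir(parents=True, exist_ok=True)
    (project / "config.py").touch()
    return project


# Build the layout in a throwaway directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    project = scaffold("my_project", root=tmp)
    created = sorted(p.name for p in project.iterdir())

print(created)
```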
/connect
Connect to existing DSPy workspace:
- Indexes codebase for RAG
- Discovers modules, signatures, metrics
- Loads configuration
- Resumes session
/demo <template>
Generate demo from 12 templates:
- simple-qa - Basic question answering
- rag - Retrieval-augmented generation
- multi-hop - Multi-step reasoning
- typed-output - Structured data extraction
- classification - Multi-class classification
- agent - ReAct agent with tools
- ensemble - Multiple predictor voting
- self-refining - Iterative refinement
- hinted-qa - Guided reasoning
- program-of-thought - Code generation
- chatbot - Multi-turn conversation
- data-pipeline - ETL workflow
Options:
- --with-optimization - Include optimization example
- --with-tests - Include unit tests
- --output <path> - Custom output path
/optimize <module>
Run complete optimization workflow:
Steps:
- Load module from file
- Prompt for training data
- Prompt for metric function
- Suggest optimizer based on data size
- Run compilation
- Evaluate on dev set
- Display results and improvements
- Save optimized program
Options:
- --optimizer <type> - Force optimizer (bootstrap, mipro, copro, etc.)
- --budget <N> - Max optimization budget
- --metric <name> - Use specific metric
- --no-save - Don't save optimized program
- --use-gepa - Enable GEPA optimization
/validate <file>
Validate DSPy code for correctness:
Checks:
- Signature format validity
- Forward method presence
- Type annotations
- Module composition
- Metric function signature
- Dataset format
- Optimizer configuration
Returns:
typescript
{
  valid: boolean;
  errors: string[];       // Critical issues
  warnings: string[];     // Non-critical issues
  suggestions: string[];  // Improvement suggestions
}
/export <format>
Export module to target format:
Formats:
- python - Python code with comments
- json - JSON configuration
- yaml - YAML configuration
- markdown - Documentation
Options:
- --include-examples - Include usage examples
- --include-tests - Include test code
- --minify - Remove comments and whitespace
Version-Aware Guidance
DSPy 2.5+ (Recommended)
Full feature support:
- All 10 predictors
- All 11 optimizers
- All 4 adapters
- GEPA integration
- Typed outputs with Pydantic
- Advanced RAG features
DSPy 2.4
Legacy support:
- Most predictors available
- Optimizer syntax differences
- Limited GEPA support
- Basic RAG features
Migration guide:
python
# DSPy 2.4 (old)
optimizer = dspy.BootstrapFewShot(
    metric=metric,
    max_bootstrapped_demos=4
)

# DSPy 2.5+ (new)
optimizer = dspy.BootstrapFewShot(
    metric=metric,
    max_bootstrapped_demos=4,
    max_labeled_demos=8  # New parameter
)
DSPy 2.0-2.3
Basic support:
- Core predictors only
- Limited optimizer options
- No GEPA
- Manual RAG setup
Usage Examples
Example 1: Build Simple QA Module
User: Create a question-answering module with chain-of-thought reasoning
Assistant (using dspy-code):
python
import dspy

class QAModule(dspy.Module):
    """Question answering with chain of thought."""

    def __init__(self):
        super().__init__()
        self.qa = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.qa(question=question)

# Usage
# dspy.OpenAI is the DSPy 2.4-style client; on DSPy 2.5+ the equivalent is
# lm = dspy.LM("openai/gpt-3.5-turbo") followed by dspy.configure(lm=lm).
lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)
qa = QAModule()
result = qa(question="What is DSPy?")
print(result.answer)
**Next steps**: Use `/validate` to check, then `/optimize` with training data.
Example 2: Optimize RAG Pipeline
User: I have a RAG module and 100 training examples. How do I optimize it?
Assistant (using dspy-code):
python
import dspy
from dspy.evaluate import Evaluate

# 1. Define metric
def accuracy(example, prediction, trace=None):
    return example.answer.lower() in prediction.answer.lower()

# 2. Prepare data
# Assumes `data` is an iterable of (question, answer) pairs.
trainset = [
    dspy.Example(question=question, answer=answer).with_inputs("question")
    for question, answer in data
]

# 3. Choose optimizer (MIPRO for 100+ examples)
optimizer = dspy.MIPRO(
    metric=accuracy,
    prompt_model=dspy.OpenAI("gpt-4"),
    task_model=dspy.OpenAI("gpt-3.5-turbo"),
    num_trials=20,
    minibatch_size=50
)

# 4. Compile
optimized_rag = optimizer.compile(rag_module, trainset=trainset)

# 5. Evaluate
evaluator = Evaluate(devset=devset, metric=accuracy)
score = evaluator(optimized_rag)
# Evaluate reports a 0-100 percentage in DSPy 2.x, so format it as a plain number.
print(f"Accuracy: {score:.2f}%")

# 6. Save
optimized_rag.save('rag_optimized.json')
**Expected improvement**: +30-50% accuracy with 100 examples and MIPRO.
Example 3: Use GEPA for Prompt Optimization
User: I want to automatically find the best prompt for my task
Assistant (using dspy-code):
python
import dspy
from dspy.gepa import GEPA

# 1. Define task
seed_prompt = "question -> answer"
training_examples = trainset[:50]

# 2. Configure GEPA
gepa = GEPA(
    metric=accuracy,
    population_size=10,
    generations=20,
    mutation_rate=0.3,
    crossover_rate=0.7
)

# 3. Optimize
result = gepa.optimize(
    seed_prompt=seed_prompt,
    training_examples=training_examples,
    budget=100  # Max 100 LLM calls
)

# 4. Use optimized prompt
print(f"Best prompt: {result.best_prompt}")
print(f"Score: {result.best_score:.2%}")

# 5. Create module with optimized prompt
class OptimizedQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.qa = dspy.ChainOfThought(result.best_prompt)

    def forward(self, question):
        return self.qa(question=question)
**GEPA benefits**: Explores prompt space automatically, no manual engineering needed.
Best Practices
1. Start Simple
Begin with basic signatures and predictors:
python
# Good: Start simple
self.qa = dspy.ChainOfThought("question -> answer")

# Bad: Overengineering
self.qa = dspy.Ensemble([
    dspy.ChainOfThought(...),
    dspy.ProgramOfThought(...),
    dspy.ReAct(...)
])  # Too complex for iteration
2. Optimize Early
Run optimization on small datasets before scaling:
python
# Iterate quickly with 10 examples
quick_optimizer = dspy.BootstrapFewShot(metric=accuracy)
quick_test = quick_optimizer.compile(module, trainset=trainset[:10])

# Then scale to full dataset
full_optimizer = dspy.MIPRO(metric=accuracy)
production = full_optimizer.compile(module, trainset=full_trainset)
3. Measure Everything
Track metrics throughout development:
python
from datetime import datetime

# Log all predictions
# log_prediction is a user-supplied logging helper; it is not defined here.
def predict_with_logging(module, input):
    prediction = module(input=input)
    log_prediction(input, prediction, timestamp=datetime.now())
    return prediction
4. Version Control
Save and track optimized programs:
python
from datetime import datetime

# Save with version
version = "v1.2.3"
optimized.save(f'models/qa_{version}.json')

# Track performance
# save_performance_log is a user-supplied helper; it is not defined here.
performance_log = {
    'version': version,
    'dev_score': dev_score,
    'test_score': test_score,
    'optimizer': 'MIPRO',
    'timestamp': datetime.now().isoformat()
}
save_performance_log(performance_log)
5. Modular Design
Keep modules focused and composable:
python
import dspy

# Good: Single responsibility
# Assumed setup: the retriever and predictor choices in __init__ are illustrative.
class Retriever(dspy.Module):
    def __init__(self, k=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=k)

    def forward(self, query):
        return self.retrieve(query)

class Generator(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, context, question):
        return self.generate(context=context, question=question)

class RAG(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retriever = Retriever()
        self.generator = Generator()

    def forward(self, question):
        # dspy.Retrieve returns a prediction whose passages field holds the documents.
        context = self.retriever(query=question).passages
        return self.generator(context=context, question=question)
6. Test Thoroughly
Unit test modules before optimization:
python
import unittest

class TestQAModule(unittest.TestCase):
    def setUp(self):
        self.qa = QAModule()

    def test_basic_question(self):
        result = self.qa(question="What is 2+2?")
        self.assertIsNotNone(result.answer)

    def test_complex_question(self):
        result = self.qa(question="Explain quantum computing")
        self.assertTrue(len(result.answer) > 50)
Troubleshooting
Issue: Low improvement after optimization
Solutions:
- Increase training data size (aim for 50-200 examples)
- Try different optimizer (MIPRO, COPRO for better quality)
- Improve metric function (ensure it captures task requirements)
- Add more demonstrations (max_bootstrapped_demos)
- Use stronger teacher model (GPT-4 for optimization)
Issue: Optimization too slow
Solutions:
- Reduce num_trials or budget
- Use smaller training set for iteration
- Enable parallel evaluation (num_threads=4)
- Use faster base model (gpt-3.5-turbo)
- Cache predictions to avoid redundant calls
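The caching suggestion can be illustrated with the standard library alone. In this sketch `expensive_predict` is a hypothetical stand-in for a module call that would reach the language model, and `functools.lru_cache` memoizes results by question string; it assumes that identical inputs should always return the same answer.

```python
from functools import lru_cache

calls = {"count": 0}


def expensive_predict(question):
    """Stand-in for a module call that would hit the language model."""
    calls["count"] += 1
    return f"answer to: {question}"


@lru_cache(maxsize=1024)
def cached_predict(question):
    # Identical questions are served from memory after the first call.
    return expensive_predict(question)


for q in ["What is DSPy?", "What is DSPy?", "What is RAG?"]:
    cached_predict(q)

print(calls["count"])  # 2 underlying calls for 3 requests
```

`lru_cache` needs hashable arguments, so this pattern suits string inputs; structured inputs would first have to be converted to a hashable key.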
Issue: Module validation fails
Solutions:
- Check signature format: "input1, input2 -> output1, output2"
- Ensure forward() method exists and returns Prediction
- Add type hints: def forward(self, input: str) -> dspy.Prediction
- Verify module inheritance: class MyModule(dspy.Module)
- Check that all predictors are initialized in __init__()
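The signature format check can be tried locally before running validation. The parser below is a minimal sketch that accepts only the plain "inputs -> outputs" form shown in this list; `parse_signature` is a hypothetical helper, and typed or annotated fields are deliberately not handled.

```python
import re

IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def parse_signature(signature):
    """Split "a, b -> c" into input and output field names, or raise ValueError."""
    if signature.count("->") != 1:
        raise ValueError("expected exactly one '->'")
    left, right = signature.split("->")
    inputs = [name.strip() for name in left.split(",")]
    outputs = [name.strip() for name in right.split(",")]
    for name in inputs + outputs:
        # Every field must be a valid Python-style identifier.
        if not IDENTIFIER.match(name):
            raise ValueError(f"invalid field name: {name!r}")
    return inputs, outputs


print(parse_signature("input1, input2 -> output1, output2"))
```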
Issue: GEPA not improving prompts
Solutions:
- Increase population_size (try 15-20)
- Run more generations (try 30-50)
- Adjust mutation rate (0.4-0.5 for exploration)
- Provide more training examples (50+ recommended)
- Ensure metric function is accurate and informative
Integration with Claude Code
This skill provides:
- Contextual assistance - Deep DSPy knowledge available in chat
- Code generation - Generate modules from templates
- Validation - Check DSPy code for correctness
- Optimization guidance - Recommend optimizers and configurations
- Workflow management - Track sessions and progress
- Export utilities - Convert to multiple formats
When to use in conversation:
- "Create a RAG module with MIPRO optimization"
- "Validate my DSPy code"
- "What optimizer should I use for 50 examples?"
- "Generate a demo of multi-hop reasoning"
- "Export this module to JSON"
- "How do I use GEPA for prompt optimization?"
Resources
- DSPy Documentation: https://dspy-docs.vercel.app
- GitHub: https://github.com/stanfordnlp/dspy
- Paper: "DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines"
- Examples: https://github.com/stanfordnlp/dspy/tree/main/examples
- Codebase: /Users/mikhail/Downloads/architect/dspy-code-codebase
Skill Version: 1.0.0
Last Updated: 2025-12-02
Compatible with: DSPy 2.4+