rlm
Build an RLM
An RLM is a callable, pre-configured agent. It autonomously explores context, writes and executes code in a sandboxed REPL, calls tools, inspects results, and iterates until the task is done. Unlike a chat agent, an RLM is a function — you define its inputs, outputs, and tools, then call it from your code. It returns structured data, not chat messages.
This skill has two phases:
- Plan — interactively define the RLM with the user, research feasibility, produce a plan
- Build — implement the plan as code files
First action: Enter plan mode using the EnterPlanMode tool.
Phase 1: Plan
Work through these steps interactively. Do not skip steps or rush to the plan. Each step should involve asking the user questions and confirming alignment before moving on.
Step 1: Goal Definition
Understand what the user wants to build.
Ask:
- What is the desired outcome? What does success look like?
- What is the input material? (documents, code, data, APIs, etc.)
- What does the output look like? (structured report, modified files, spreadsheet, etc.)
Then validate RLM fit. An RLM is the right tool when:
- The input is large and needs selective exploration (documents, datasets, codebases)
- The task is multi-step with tool use (extract -> transform -> validate)
- Actions modify state (redaction, form filling, generation)
- Parallel sub-LM calls are needed across many items
- File-to-file transformations (PDFs -> spreadsheets, documents -> reports)
If the task is better served by a single LLM call or a simple script, tell the user and suggest an alternative. Otherwise, proceed.
Step 2: Input Design
Work with the user to define every input to the RLM.
For each input, determine:
- Name and type: `str`, `File`, `list[File]`, or a Pydantic model
- Description: what it contains and how the RLM uses it
- Source: user-provided file, API response, config, generated data
Key principles:
- Large content (PDFs, images, datasets) must be `File` references. The RLM accesses content on demand through skills, keeping its context small
- Metadata (file paths, page counts, config flags) can be strings or Pydantic models
- Use `list[File]` for variable-count file inputs
Confirm the input design with the user before proceeding.
Step 3: Output Design
Work with the user to define the structured output.
For each output field, determine:
- Name, type, and description
- Whether it's a Pydantic model (structured data), `File` (generated file), or primitive
Push for specificity: vague outputs lead to poor RLM performance. Sketch the Pydantic models with `Field(description=...)` annotations. Include nested models where appropriate.
Ask the user:
- What fields matter most? What would they check first?
- Are there any computed/derived fields (scores, summaries, counts)?
- Do they need output files (Excel, PDF, images)?
Confirm the output design with the user before proceeding.
Step 4: Research
This step is autonomous. Tell the user you are researching, then do it.
Use web search and the Explore subagent to:
- Find Python packages for the domain (e.g., `networkx` for graphs, `tree-sitter` for code parsing, `beautifulsoup4` for HTML).
- Check Pyodide compatibility. The sandbox runs Pyodide (Python in WASM). Only pure-Python wheels or packages with Emscripten builds work. Search pypi.org for each package and check:
  - Does it have a `py3-none-any` wheel? (pure Python, works)
  - Does it have C extensions without Emscripten builds? (won't work in sandbox)
  - Is it in the Pyodide built-in package list? (check https://pyodide.org/en/stable/usage/packages-in-pyodide.html)
- Identify network needs. Does the task require calling external APIs? If so, note the domains for `allowed_domains`.
- Identify host-side tool needs. If any functionality cannot run in WASM (native binaries, C extensions, heavy computation), it must be a host-side tool: a Python function running on the host that the RLM calls like any other tool.
- Check for existing skills. The built-in skills are:
  - `pdf`: pymupdf for PDF rendering, text extraction, manipulation
  - `spreadsheet`: openpyxl, pandas, formulas for Excel work
  - `docx`: python-docx for reading, writing, and modifying Word documents
Report findings to the user with a clear feasibility assessment. Flag any blockers.
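The wheel check above can be partly mechanized. The following is a minimal sketch, assuming you have already collected a package's wheel filenames from its PyPI files page. The helper names are hypothetical, and matching on the substrings `emscripten` and `pyodide` in the platform tag is an assumption about how such builds are labeled. An Emscripten wheel must also match the sandbox's Pyodide version, which this sketch does not check.

```python
def classify_wheel(filename: str) -> str:
    """Classify a wheel filename as "pure-python", "emscripten", or "native"."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    # Wheel names end in {python tag}-{abi tag}-{platform tag}.whl
    python_tag, abi_tag, platform_tag = filename[:-4].split("-")[-3:]
    if abi_tag == "none" and platform_tag == "any":
        return "pure-python"
    # Assumption: WASM builds carry one of these markers in the platform tag.
    if "emscripten" in platform_tag or "pyodide" in platform_tag:
        return "emscripten"
    return "native"


def sandbox_candidate(wheel_filenames: list[str]) -> bool:
    """True if at least one wheel is not a native (host-only) build."""
    return any(classify_wheel(name) != "native" for name in wheel_filenames)
```

A package whose wheels are all classified `native` is a candidate for a host-side tool rather than a sandbox package.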
Step 5: Skill Design
Based on research, design the skill configuration.
Built-in skills
List which built-in skills to use and why.
Custom skills (if needed)
For each custom skill, define:
- name: short identifier
- instructions: prose guidance injected into the RLM's system prompt — teaches the RLM patterns and best practices. Be detailed; this is the primary way to control RLM behavior.
- packages: PyPI packages installed in the sandbox via micropip (must be Pyodide-compatible)
- modules: Python files mounted into the sandbox as importable modules
- tools: host-side callable functions exposed to the RLM
Host-side tool design
For each host-side tool:
- Function name and signature with type hints
- Docstring (the RLM sees this to understand how to call it)
- What it does and why it must be host-side
Confirm the skill design with the user before proceeding.
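As an illustration of the three points above, here is a sketch of a host-side tool written with the standard library only. `file_sha256` and `describe_tool` are hypothetical names. The tool must run on the host because it reads a path on the host filesystem, which the sandbox cannot see. How the RLM framework actually renders a tool to the model is not shown here; `describe_tool` only previews the signature and docstring you are writing.

```python
import hashlib
import inspect


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file on the host.

    Args:
        path: Absolute path to a file on the host filesystem.
        chunk_size: Number of bytes to read per chunk.

    Returns:
        Lowercase hexadecimal digest string.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_tool(fn) -> str:
    """Render the name, signature, and docstring a caller would rely on."""
    return f"{fn.__name__}{inspect.signature(fn)}\n{inspect.getdoc(fn)}"
```

Printing `describe_tool(file_sha256)` is a quick way to review whether the type hints and docstring are enough to call the tool correctly without reading its body.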
Step 6: Strategy and Architecture
Signature strategy
Write the step-by-step strategy that goes in the signature's docstring. This is the RLM's playbook:
- What to do first (survey/understand the input)
- How to gather information (render pages, use predict() for extraction, call tools)
- How to process and synthesize
- What to produce and where to save output files
Single vs chained RLMs
Evaluate whether this needs one RLM or multiple chained RLMs.
Use a single RLM when:
- The task is one coherent workflow
- All steps need the same context/state
- The iteration count stays reasonable (under 40)
Use chained RLMs when:
- There are distinct phases with different skill needs
- One phase produces artifacts consumed by another
- The combined task would exceed reasonable iteration counts
- Different phases benefit from different sub-LM models
If chaining, define each stage:
- Stage name, signature (inputs/outputs), skills, strategy
- The DAG: which stage feeds into which, with typed connections
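When a chain has more than two stages, it can help to write the DAG down as data and check that it really is acyclic. The sketch below uses the standard-library `graphlib` module; the stage names and the `execution_order` helper are hypothetical and are not part of any RLM API.

```python
from graphlib import TopologicalSorter

# Hypothetical stages: each maps to the set of stages whose output it consumes.
stage_inputs = {
    "extract": set(),
    "analyze": {"extract"},
    "report": {"extract", "analyze"},
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return a stage order in which every stage runs after its inputs.

    Raises graphlib.CycleError if the stages do not form a DAG.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(stage_inputs)
```

A `CycleError` at planning time is a sign that two stages depend on each other and should probably be merged into a single RLM.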
Configuration
- `max_iterations` estimate per RLM
- `allowed_domains` if network access is needed
- `sub_lm` recommendations (capability level needed)
Feasibility Checklist
Before producing the final plan, verify:
- All proposed packages are Pyodide-compatible (or have host-side fallbacks)
- Network access needs are identified with specific domains
- Host-side tools are defined for anything that can't run in WASM
- Iteration count is reasonable (under 50 per RLM)
- Input sizes are manageable (or chunking strategy is defined)
- Output schemas are specific enough for reliable extraction
- The task is achievable — no unsupported capabilities assumed
Plan Output
Write the plan to the Claude Code plan file with these sections:
- Overview: one paragraph covering what, why, and expected workflow
- File manifest: every file to create with a one-line description
- Input schemas: complete Pydantic model code for `schema.py`
- Output schemas: complete Pydantic model code for `schema.py`
- Signature: complete `signature.py` code with strategy docstring
- Skills configuration: built-in imports + custom `Skill(...)` definitions + tool signatures
- Service architecture: single RLM wiring or chained DAG, e.g. `Stage1(documents) --[ExtractedData]--> Stage2(extracted) --[Report]--> Stage3(report)`
- Feasibility notes: constraints, risks, alternatives
- Estimated complexity: iteration count, sub-LM calls, cost range, runtime
After writing the plan, use ExitPlanMode to get user approval. Once approved, proceed to Phase 2.
Phase 2: Build
Implement the approved plan. Create all files following the patterns below.
File structure
```
my_rlm/
├── __init__.py      # Public exports (service class, schema, signature)
├── schema.py        # Pydantic models for inputs AND outputs
├── signature.py     # DSPy Signature (inputs/outputs + strategy docstring)
├── service.py       # DSPy Module wiring signature + PredictRLM + skills
└── skills.py        # (optional) Custom skill definitions beyond built-in skills
```
Always create: `schema.py`, `signature.py`, `service.py`, `__init__.py`
Create when needed: `skills.py` (only if the RLM needs domain-specific instructions beyond built-in skills)
schema.py: Pydantic models
Define models for structured inputs and outputs. Use `Field(description=...)` so the RLM knows what each field means.
```python
from pydantic import BaseModel, Field


class KeyDate(BaseModel):
    """A key date extracted from a document."""

    name: str = Field(description="e.g. 'Submission Deadline', 'Effective Date'")
    date: str = Field(description="ISO format date (YYYY-MM-DD)")
    time: str | None = Field(
        None, description="24-hour format (HH:MM), e.g. '14:00', '09:30'"
    )
    timezone: str | None = Field(
        None, description="Timezone code, e.g. 'EST', 'EDT', 'PST', 'UTC'"
    )


class DocumentAnalysis(BaseModel):
    """Structured analysis of a document set."""

    report: str = Field(
        description="Full analysis as a well-formatted markdown report"
    )
    key_dates: list[KeyDate] = Field(
        default_factory=list, description="Important dates found in the documents"
    )
```
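The field descriptions tell the RLM which format to produce, but `str` fields do not enforce it. If you want to reject malformed values, a check along these lines could be called from a validator on the model. This is a standard-library sketch; `is_iso_date` and `is_24h_time` are hypothetical helper names, not part of the schema above.

```python
import re
from datetime import datetime

_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
_TIME_RE = re.compile(r"\d{2}:\d{2}")


def _matches(value: str, pattern: re.Pattern, fmt: str) -> bool:
    """Check the exact shape first, then let strptime check the ranges."""
    if not pattern.fullmatch(value):
        return False
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


def is_iso_date(value: str) -> bool:
    """True for a real calendar date written as YYYY-MM-DD."""
    return _matches(value, _DATE_RE, "%Y-%m-%d")


def is_24h_time(value: str) -> bool:
    """True for a 24-hour time written as HH:MM."""
    return _matches(value, _TIME_RE, "%H:%M")
```

The regular expression is needed because `strptime` alone tolerates single-digit months, days, and hours.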
signature.py: Inputs, outputs, and strategy
The docstring becomes the RLM's system instructions. Use it to tell the RLM how to approach the task step by step:
```python
import dspy
from predict_rlm import File

from .schema import DocumentAnalysis


class AnalyzeDocuments(dspy.Signature):
    """Analyze documents and produce a structured report.

    1. **Read the report criteria** (appended below) to understand what
       information to extract and in what format.
    2. **Survey the documents** to understand what you're working with:
       file names, page counts, document types.
    3. **Gather information** systematically by rendering pages as images
       and using predict() to extract content.
    4. **Produce the report** following the format specified in the criteria.
       Use tables for structured data, prose for analysis and context.
    """

    documents: list[File] = dspy.InputField(
        desc="PDF documents to analyze"
    )
    # The description lists only fields that DocumentAnalysis actually defines.
    analysis: DocumentAnalysis = dspy.OutputField(
        desc="Structured analysis with markdown report and key dates"
    )
```
service.py: Wiring it together
Wrap signature + skills + PredictRLM into a reusable DSPy Module:
```python
import dspy
from predict_rlm import File, PredictRLM
from predict_rlm.skills import pdf as pdf_skill

from .schema import DocumentAnalysis
from .signature import AnalyzeDocuments


class DocumentAnalyzer(dspy.Module):
    def __init__(
        self,
        sub_lm: dspy.LM | str | None = None,
        max_iterations: int = 30,
        verbose: bool = False,
        debug: bool = False,
    ):
        super().__init__()
        self.sub_lm = sub_lm
        self.max_iterations = max_iterations
        self.verbose = verbose
        self.debug = debug

    async def aforward(
        self, documents: list[File], criteria: str
    ) -> DocumentAnalysis:
        # Append the caller's criteria to the strategy docstring.
        signature = AnalyzeDocuments.with_instructions(
            AnalyzeDocuments.instructions + "\n\n# Task\n\n" + criteria.strip()
        )
        predictor = PredictRLM(
            signature,
            sub_lm=self.sub_lm,
            skills=[pdf_skill],
            max_iterations=self.max_iterations,
            verbose=self.verbose,
            debug=self.debug,
        )
        result = await predictor.acall(documents=documents)
        return result.analysis
```
When using multiple skills or host-side tools:
```python
from predict_rlm.skills import pdf as pdf_skill
from predict_rlm.skills import spreadsheet as spreadsheet_skill


# Fragment: a method of your dspy.Module subclass.
async def aforward(self, documents: list[File]) -> MyOutput:
    predictor = PredictRLM(
        MySignature,
        sub_lm=self.sub_lm,
        skills=[pdf_skill, spreadsheet_skill],
        tools={"fetch_exchange_rate": fetch_exchange_rate},
        # ...
    )
```
Chaining pattern (multiple RLMs)
```python
async def aforward(self, documents: list[File]):
    # Stage 1: Extract
    extractor = PredictRLM(ExtractSignature, sub_lm=self.sub_lm, skills=[pdf_skill])
    extracted = await extractor.acall(documents=documents)

    # Stage 2: Analyze (uses output from stage 1)
    analyzer = PredictRLM(AnalyzeSignature, sub_lm=self.sub_lm, skills=[analysis_skill])
    result = await analyzer.acall(data=extracted.data)
    return result
```
skills.py: Custom skills
Create only when the RLM needs domain-specific instructions beyond built-in skills.
```python
from predict_rlm import Skill
from predict_rlm.skills import pdf as pdf_skill

redaction_skill = Skill(
    name="redaction",
    instructions="""How to redact content from PDFs using pymupdf.

## Text redaction

Search for text, create redaction annotations, then apply:

    page = doc[page_num]
    hits = page.search_for("sensitive text")
    for rect in hits:
        page.add_redact_annot(rect, fill=(0, 0, 0))
    page.apply_redactions()
...""",
)

__all__ = ["pdf_skill", "redaction_skill"]
```
---
Architecture Reference
Use this reference to ensure plans and implementations are accurate. Do not hallucinate parameters or patterns.
How an RLM works
The architecture is two-level:
- The outer LLM (the RLM itself) writes and executes Python code in a sandboxed Pyodide/WASM REPL. It plans, orchestrates, and iterates.
- The sub-LM (via `predict()`) handles perception and extraction: analyzing images, understanding text, and returning typed results. Each `predict()` call gets its own context window.
The outer LLM's context stays small (code + tool results), while context-heavy work is offloaded to `predict()` calls.
File I/O
Use `File` for file-typed fields:
- Input field: mounts the file from host into the sandbox at `/sandbox/input/{field_name}/`
- Output field: syncs from `/sandbox/output/{field_name}/` back to the host
```python
from predict_rlm import File

# Input: File(path="/absolute/path/to/file.pdf")
# Output: declared as File output field, RLM writes to /sandbox/output/<field>/
```
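To make the directory convention concrete, the sketch below computes where files would be expected for a given field name. It loudly assumes that a mounted input keeps its basename under the field directory, which the text above does not state; the helper names are hypothetical and not part of `predict_rlm`.

```python
from pathlib import PurePosixPath

SANDBOX_INPUT_ROOT = PurePosixPath("/sandbox/input")
SANDBOX_OUTPUT_ROOT = PurePosixPath("/sandbox/output")


def sandbox_input_path(field_name: str, host_path: str) -> PurePosixPath:
    """Guess the in-sandbox path of a host file (assumes the basename is kept)."""
    return SANDBOX_INPUT_ROOT / field_name / PurePosixPath(host_path).name


def sandbox_output_dir(field_name: str) -> PurePosixPath:
    """Directory the RLM writes to for a File output field."""
    return SANDBOX_OUTPUT_ROOT / field_name
```

Paths like these are useful in the signature docstring when you tell the RLM where to save its output files.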
PredictRLM constructor
```python
PredictRLM(
    signature: type[Signature] | str,        # DSPy signature class
    lm: dspy.LM | str | None = None,         # Main LM (code generation)
    sub_lm: dspy.LM | str | None = None,     # Sub-LM for predict() calls
    max_iterations: int = 30,
    max_llm_calls: int = 50,
    verbose: bool = False,
    tools: dict[str, Callable] | list[Callable] | None = None,
    allowed_domains: list[str] | None = None,
    skills: list[Skill] | None = None,
    debug: bool = False,
    output_dir: str | Path | None = None,
)
```
Both `lm` and `sub_lm` accept a model string (e.g. `"openai/gpt-5.4"`) or a `dspy.LM` instance. If `lm` is omitted, the current context LM from `dspy.context(lm=...)` is used.
Skill dataclass
```python
from predict_rlm import Skill

Skill(
    name="my-skill",                           # Short identifier
    instructions="How to approach...",         # Prose injected into the RLM prompt
    packages=["pandas", "openpyxl"],           # PyPI packages installed in the sandbox
    modules={"helper": "/path/to/helper.py"},  # Python files mounted as importable modules
    tools={"fetch": fetch_fn},                 # Host-side callable functions exposed to the RLM
)
```
Skills can bundle host-side tools via their `tools=` field. When skills are composed, their tools are merged alongside instructions and packages (tool name conflicts raise errors).
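The merge rule for tools can be pictured with a few lines of plain Python. This is an illustrative sketch, not the library's implementation: `merge_tools` is a hypothetical name, and raising `ValueError` is an assumption, since the text only says that conflicts raise errors.

```python
from typing import Callable


def merge_tools(*tool_sets: dict[str, Callable]) -> dict[str, Callable]:
    """Combine tool dictionaries, refusing to overwrite an existing name."""
    merged: dict[str, Callable] = {}
    for tools in tool_sets:
        for name, fn in tools.items():
            if name in merged:
                # Assumed error type; the real library may use a different one.
                raise ValueError(f"tool name conflict: {name!r}")
            merged[name] = fn
    return merged
```

In practice this means tool names should be specific (for example `fetch_exchange_rate` rather than `fetch`) so that skills stay composable.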
Built-in skills
```python
from predict_rlm.skills import pdf as pdf_skill                  # pymupdf
from predict_rlm.skills import spreadsheet as spreadsheet_skill  # openpyxl, pandas, formulas
from predict_rlm.skills import docx as docx_skill                # python-docx
```
| Skill | Packages | Modules | What it teaches the RLM |
|---|---|---|---|
| `pdf` | `pymupdf` | none | Read, render, modify, and redact PDFs |
| `spreadsheet` | `openpyxl`, `pandas`, `formulas` | | Build and modify Excel workbooks with formulas and formatting |
| `docx` | `python-docx` | | Read, write, and modify Word documents with tables, formatting, and styles |
Tools
Tools are host-side functions the RLM can call from the sandbox. Use them for operations that cannot run inside the sandbox — host access, authenticated APIs, database queries, system resources.
```python
import httpx


async def fetch_exchange_rate(currency: str, date: str) -> str:
    """Fetch the exchange rate for a currency on a given date.

    Args:
        currency: ISO currency code (e.g. "EUR", "GBP")
        date: Date in YYYY-MM-DD format

    Returns:
        JSON string with the exchange rate data
    """
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"https://api.example.com/rates/{currency}/{date}")
        return resp.text
```
Tools can be passed directly to PredictRLM via `tools={"name": fn}` or bundled inside a Skill via `tools=`.
When to use a Skill vs tools
| Use a Skill when... | Use `tools=` when... |
|---|---|
| The RLM needs a package installed in the sandbox | The function must run on the host (API calls, DB queries, filesystem) |
| You need to teach the RLM how to use something | The tool's docstring is self-explanatory |
| The knowledge is reusable across RLMs | It's a single specific function for one RLM |
predict() tool (inside sandbox)
The RLM can call `predict()` for sub-LM perception/extraction:
```python
result = await predict(
    "image: dspy.Image -> items: list[Item]",
    instructions="Extract all line items from this invoice page",
    image=page_image,
)
```
Each `predict()` call gets its own context window. Supports `dspy.Image` for multimodal inputs.
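Because each call is awaited and independent, many items can be processed concurrently. The sketch below shows the fan-out pattern with plain `asyncio`. `fake_predict` is a stand-in, since the real `predict()` exists only inside the sandbox, and it is an assumption that the sandbox supports `asyncio.gather` in this way. The concurrency limit is a hypothetical parameter.

```python
import asyncio


async def fake_predict(page_number: int) -> dict:
    """Stand-in for one sub-LM call; returns a canned result for the page."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"page": page_number, "items": [f"item-{page_number}"]}


async def extract_all(page_numbers: list[int], max_concurrency: int = 4) -> list[dict]:
    """Run one call per page, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(page_number: int) -> dict:
        async with semaphore:
            return await fake_predict(page_number)

    # gather returns results in the same order as its arguments.
    return await asyncio.gather(*(bounded(n) for n in page_numbers))


results = asyncio.run(extract_all([1, 2, 3]))
```

Bounding concurrency keeps a large document from issuing hundreds of sub-LM calls at once, which also makes the `max_llm_calls` budget easier to reason about.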
Key imports
```python
from predict_rlm import PredictRLM, Skill, File
from predict_rlm.skills import pdf, spreadsheet, docx
```
WASM sandbox constraints
- Only pure-Python wheels or Pyodide built-in packages work
- No subprocess, no native binaries, no C extensions (unless Emscripten-built)
- Network access requires an `allowed_domains` whitelist
- File I/O is within the sandbox filesystem
- Host-side tools bridge the gap for anything WASM can't do
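When planning network access, one way to collect the whitelist is to derive it from the URLs the task is known to call. This is a standard-library sketch; `domains_for` is a hypothetical helper, and it assumes `allowed_domains` expects bare hostnames, which should be confirmed against the library before relying on it.

```python
from urllib.parse import urlparse


def domains_for(urls: list[str]) -> list[str]:
    """Return the sorted, de-duplicated hostnames of the given URLs."""
    hosts = set()
    for url in urls:
        host = urlparse(url).hostname  # lowercased by urlparse
        if host is None:
            raise ValueError(f"URL has no hostname: {url!r}")
        hosts.add(host)
    return sorted(hosts)
```

The resulting list can be recorded in the plan's feasibility notes alongside the reason each domain is needed.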