
Guidance: Constrained LLM Generation


When to Use This Skill


Use Guidance when you need to:
  • Control LLM output syntax with regex or grammars
  • Guarantee valid JSON/XML/code generation
  • Reduce latency vs traditional prompting approaches
  • Enforce structured formats (dates, emails, IDs, etc.)
  • Build multi-step workflows with Pythonic control flow
  • Prevent invalid outputs through grammatical constraints
GitHub Stars: 18,000+ | From: Microsoft Research

Installation

```bash
# Base installation
pip install guidance

# With specific backends
pip install guidance[transformers]  # Hugging Face models
pip install guidance[llama_cpp]     # llama.cpp models
```

Quick Start


Basic Example: Structured Generation

```python
from guidance import models, gen

# Load model (supports OpenAI, Transformers, llama.cpp)
lm = models.OpenAI("gpt-4")

# Generate with constraints
result = lm + "The capital of France is " + gen("capital", max_tokens=5)
print(result["capital"])  # "Paris"
```

With Anthropic Claude

```python
from guidance import models, gen, system, user, assistant

# Configure Claude
lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Use context managers for chat format
with system():
    lm += "You are a helpful assistant."
with user():
    lm += "What is the capital of France?"
with assistant():
    lm += gen(max_tokens=20)
```

Core Concepts


1. Context Managers

Guidance uses Pythonic context managers for chat-style interactions.

```python
from guidance import models, system, user, assistant, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# System message
with system():
    lm += "You are a JSON generation expert."

# User message
with user():
    lm += "Generate a person object with name and age."

# Assistant response
with assistant():
    lm += gen("response", max_tokens=100)

print(lm["response"])
```

**Benefits:**
- Natural chat flow
- Clear role separation
- Easy to read and maintain

2. Constrained Generation


Guidance ensures outputs match specified patterns using regex or grammars.

Regex Constraints

```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to valid email format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Constrain to date format (YYYY-MM-DD)
lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}")

# Constrain to phone number
lm += "Phone: " + gen("phone", regex=r"\d{3}-\d{3}-\d{4}")

print(lm["email"])  # Guaranteed valid email
print(lm["date"])   # Guaranteed YYYY-MM-DD format
```

**How it works:**
- The regex is converted to a grammar at the token level
- Invalid tokens are filtered during generation
- The model can only produce matching outputs

Selection Constraints

```python
from guidance import models, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to specific choices
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")

# Multiple-choice selection
lm += "Best answer: " + select(
    ["A) Paris", "B) London", "C) Berlin", "D) Madrid"],
    name="answer"
)

print(lm["sentiment"])  # One of: positive, negative, neutral
print(lm["answer"])     # One of the four listed option strings
```
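How does `select()` guarantee one of the options? Conceptually, constrained decoding masks the vocabulary at each step so that only tokens which can still extend to some allowed option survive. A toy sketch of that idea (an illustration only, not Guidance internals; the option and vocabulary lists are made up):

```python
# Allowed final outputs and a toy token vocabulary
options = ["positive", "negative", "neutral"]
vocab = ["pos", "neg", "neu", "itive", "ative", "tral", "xyz"]

def allowed_tokens(prefix):
    """Tokens that keep `prefix` extendable to one of the options."""
    return [t for t in vocab
            if any(opt.startswith(prefix + t) for opt in options)]

print(allowed_tokens(""))     # ['pos', 'neg', 'neu'] - 'xyz' is masked out
print(allowed_tokens("neu"))  # ['tral'] - the only completion of 'neutral'
```

The real implementation works on token IDs and logits, but the invariant is the same: at no step can the model emit a token that makes all options unreachable.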

3. Token Healing


Guidance automatically "heals" token boundaries between the prompt and the generation.

**Problem:** Tokenization creates unnatural boundaries. Suppose the prompt ends with the token `" is "` and the model's natural first token is `" Par"` (with a leading space): the naive result is `"The capital of France is  Paris"` (double space!).

**Solution:** Guidance backs up one token and regenerates.

```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Token healing is enabled by default
lm += "The capital of France is " + gen("capital", max_tokens=5)
# Result: "The capital of France is Paris" (correct spacing)
```

**Benefits:**
- Natural text boundaries
- No awkward spacing issues
- Better model performance (the model sees natural token sequences)
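The spacing problem and the healing step can be mimicked with plain strings (an illustration of the idea only; real token healing operates on token IDs inside the decoder):

```python
prompt = "The capital of France is "  # prompt ends with a trailing space
completion = " Paris"                 # model's natural first token also starts with one

naive = prompt + completion
print(repr(naive))   # 'The capital of France is  Paris' - double space

# Token healing: back up over the prompt's trailing boundary and let the
# model regenerate it together with the continuation
healed = prompt.rstrip(" ") + completion
print(repr(healed))  # 'The capital of France is Paris'
```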

4. Grammar-Based Generation


Define complex structures using context-free grammars.

```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# JSON grammar (simplified)
json_grammar = """
{
    "name": <gen name regex="[A-Za-z ]+" max_tokens=20>,
    "age": <gen age regex="[0-9]+" max_tokens=3>,
    "email": <gen email regex="[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}" max_tokens=50>
}
"""

# Generate valid JSON
lm += gen("person", grammar=json_grammar)
print(lm["person"])  # Guaranteed valid JSON structure
```

**Use cases:**
- Complex structured outputs
- Nested data structures
- Programming language syntax
- Domain-specific languages

5. Guidance Functions

Create reusable generation patterns with the `@guidance` decorator.

```python
from guidance import guidance, gen, models

@guidance
def generate_person(lm):
    """Generate a person with name and age."""
    lm += "Name: " + gen("name", max_tokens=20, stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+", max_tokens=3)
    return lm

# Use the function
lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = generate_person(lm)
print(lm["name"])
print(lm["age"])
```

**Stateful Functions:**

```python
from guidance import guidance, gen, select

@guidance(stateless=False)
def react_agent(lm, question, tools, max_rounds=5):
    """ReAct agent with tool use."""
    lm += f"Question: {question}\n\n"

    for i in range(max_rounds):
        # Thought
        lm += f"Thought {i+1}: " + gen("thought", stop="\n")

        # Action
        lm += "\nAction: " + select(list(tools.keys()), name="action")

        # Execute tool
        tool_result = tools[lm["action"]]()
        lm += f"\nObservation: {tool_result}\n\n"

        # Check if done
        lm += "Done? " + select(["Yes", "No"], name="done")
        if lm["done"] == "Yes":
            break

    # Final answer
    lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
    return lm
```

Backend Configuration


Anthropic Claude

```python
from guidance import models

lm = models.Anthropic(
    model="claude-sonnet-4-5-20250929",
    api_key="your-api-key"  # Or set ANTHROPIC_API_KEY env var
)
```

OpenAI

```python
from guidance import models

lm = models.OpenAI(
    model="gpt-4o-mini",
    api_key="your-api-key"  # Or set OPENAI_API_KEY env var
)
```

Local Models (Transformers)

```python
from guidance.models import Transformers

lm = Transformers(
    "microsoft/Phi-4-mini-instruct",
    device="cuda"  # Or "cpu"
)
```

Local Models (llama.cpp)

```python
from guidance.models import LlamaCpp

lm = LlamaCpp(
    model_path="/path/to/model.gguf",
    n_ctx=4096,
    n_gpu_layers=35
)
```

Common Patterns


Pattern 1: JSON Generation

```python
from guidance import models, gen, system, user, assistant

lm = models.Anthropic("claude-sonnet-4-5-20250929")

with system():
    lm += "You generate valid JSON."

with user():
    lm += "Generate a user profile with name, age, and email."

with assistant():
    lm += """{
    "name": """ + gen("name", regex=r'"[A-Za-z ]+"', max_tokens=30) + """,
    "age": """ + gen("age", regex=r"[0-9]+", max_tokens=3) + """,
    "email": """ + gen("email", regex=r'"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"', max_tokens=50) + """
}"""

print(lm)  # Valid JSON guaranteed
```

Pattern 2: Classification

```python
from guidance import models, gen, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

text = "This product is amazing! I love it."

lm += f"Text: {text}\n"
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")
lm += "\nConfidence: " + gen("confidence", regex=r"[0-9]+", max_tokens=3) + "%"

print(f"Sentiment: {lm['sentiment']}")
print(f"Confidence: {lm['confidence']}%")
```

Pattern 3: Multi-Step Reasoning

```python
from guidance import models, gen, guidance

@guidance
def chain_of_thought(lm, question):
    """Generate answer with step-by-step reasoning."""
    lm += f"Question: {question}\n\n"

    # Generate multiple reasoning steps
    for i in range(3):
        lm += f"Step {i+1}: " + gen(f"step_{i+1}", stop="\n", max_tokens=100) + "\n"

    # Final answer
    lm += "\nTherefore, the answer is: " + gen("answer", max_tokens=50)

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = chain_of_thought(lm, "What is 15% of 200?")

print(lm["answer"])
```

Pattern 4: ReAct Agent

```python
from guidance import models, gen, select, guidance

@guidance(stateless=False)
def react_agent(lm, question):
    """ReAct agent with tool use."""
    tools = {
        "calculator": lambda expr: eval(expr),  # demo only: eval is unsafe on untrusted input
        "search": lambda query: f"Search results for: {query}",
    }

    lm += f"Question: {question}\n\n"

    for _ in range(5):
        # Thought
        lm += "Thought: " + gen("thought", stop="\n") + "\n"

        # Action selection
        lm += "Action: " + select(["calculator", "search", "answer"], name="action")

        if lm["action"] == "answer":
            lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
            break

        # Action input
        lm += "\nAction Input: " + gen("action_input", stop="\n") + "\n"

        # Execute tool
        if lm["action"] in tools:
            result = tools[lm["action"]](lm["action_input"])
            lm += f"Observation: {result}\n\n"

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = react_agent(lm, "What is 25 * 4 + 10?")
print(lm["answer"])
```

Pattern 5: Data Extraction

```python
from guidance import models, gen, guidance

@guidance
def extract_entities(lm, text):
    """Extract structured entities from text."""
    lm += f"Text: {text}\n\n"

    # Extract person
    lm += "Person: " + gen("person", stop="\n", max_tokens=30) + "\n"

    # Extract organization
    lm += "Organization: " + gen("organization", stop="\n", max_tokens=30) + "\n"

    # Extract date
    lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}", max_tokens=10) + "\n"

    # Extract location
    lm += "Location: " + gen("location", stop="\n", max_tokens=30) + "\n"

    return lm

text = "Tim Cook announced at Apple Park on 2024-09-15 in Cupertino."

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = extract_entities(lm, text)

print(f"Person: {lm['person']}")
print(f"Organization: {lm['organization']}")
print(f"Date: {lm['date']}")
print(f"Location: {lm['location']}")
```

Best Practices


1. Use Regex for Format Validation

```python
# ✅ Good: Regex ensures valid format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# ❌ Bad: Free generation may produce invalid emails
lm += "Email: " + gen("email", max_tokens=50)
```
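One detail worth a sanity check with plain `re` (independent of Guidance): the dot before the TLD must be escaped as `\.`, otherwise it matches any character and the pattern accepts strings that are not emails:

```python
import re

loose  = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}")   # unescaped dot
strict = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")  # escaped dot

print(bool(strict.fullmatch("user@example.com")))  # True
print(bool(strict.fullmatch("user@examplecom")))   # False - no literal dot
print(bool(loose.fullmatch("user@examplecom")))    # True - bare "." matched the "c"
```

Since the constraint only guarantees what the regex expresses, a loose pattern yields a loose guarantee.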

2. Use select() for Fixed Categories

```python
# ✅ Good: Guaranteed valid category
lm += "Status: " + select(["pending", "approved", "rejected"], name="status")

# ❌ Bad: May generate typos or invalid values
lm += "Status: " + gen("status", max_tokens=20)
```

3. Leverage Token Healing

```python
# Token healing is enabled by default.
# No special action needed - just concatenate naturally.
lm += "The capital is " + gen("capital")  # Automatic healing
```

4. Use stop Sequences

```python
# ✅ Good: Stop at newline for single-line outputs
lm += "Name: " + gen("name", stop="\n")

# ❌ Bad: May generate multiple lines
lm += "Name: " + gen("name", max_tokens=50)
```

5. Create Reusable Functions

```python
# ✅ Good: Reusable pattern
@guidance
def generate_person(lm):
    lm += "Name: " + gen("name", stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+")
    return lm

# Use multiple times
lm = generate_person(lm)
lm += "\n\n"
lm = generate_person(lm)
```

6. Balance Constraints

```python
# ✅ Good: Reasonable constraints
lm += gen("name", regex=r"[A-Za-z ]+", max_tokens=30)

# ❌ Too strict: May fail or be very slow
lm += gen("name", regex=r"^(John|Jane)$", max_tokens=10)
```

Comparison to Alternatives

| Feature | Guidance | Instructor | Outlines | LMQL |
|---|---|---|---|---|
| Regex Constraints | ✅ Yes | ❌ No | ✅ Yes | ✅ Yes |
| Grammar Support | ✅ CFG | ❌ No | ✅ CFG | ✅ CFG |
| Pydantic Validation | ❌ No | ✅ Yes | ✅ Yes | ❌ No |
| Token Healing | ✅ Yes | ❌ No | ✅ Yes | ❌ No |
| Local Models | ✅ Yes | ⚠️ Limited | ✅ Yes | ✅ Yes |
| API Models | ✅ Yes | ✅ Yes | ⚠️ Limited | ✅ Yes |
| Pythonic Syntax | ✅ Yes | ✅ Yes | ✅ Yes | ❌ SQL-like |
| Learning Curve | Low | Low | Medium | High |

When to choose Guidance:
  • Need regex/grammar constraints
  • Want token healing
  • Building complex workflows with control flow
  • Using local models (Transformers, llama.cpp)
  • Prefer Pythonic syntax
When to choose alternatives:
  • Instructor: Need Pydantic validation with automatic retrying
  • Outlines: Need JSON schema validation
  • LMQL: Prefer declarative query syntax

Performance Characteristics


Latency Reduction:
  • 30-50% faster than traditional prompting for constrained outputs
  • Token healing reduces unnecessary regeneration
  • Grammar constraints prevent invalid token generation
Memory Usage:
  • Minimal overhead vs unconstrained generation
  • Grammar compilation cached after first use
  • Efficient token filtering at inference time
Token Efficiency:
  • Prevents wasted tokens on invalid outputs
  • No need for retry loops
  • Direct path to valid outputs
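The "no retry loops" point is the practical payoff: free-form generation typically needs a validate-and-retry loop, where each failed attempt costs a full model call, while constrained decoding returns a matching string on the first pass. A toy mock of the retry cost (no real model or Guidance call involved; the replies are invented):

```python
import re

EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Mock unconstrained model: first reply is invalid, second is valid
replies = iter(["you can reach me at alice AT example dot com",
                "alice@example.com"])

attempts = 0
while True:
    attempts += 1
    out = next(replies)
    if EMAIL.fullmatch(out):
        break

print(attempts)  # 2 - one whole model call wasted on invalid output
# With a regex-constrained gen(), the first attempt already fullmatches,
# so the loop (and the wasted tokens) disappear.
```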

Resources


See Also


  • references/constraints.md
    - Comprehensive regex and grammar patterns
  • references/backends.md
    - Backend-specific configuration
  • references/examples.md
    - Production-ready examples