
Guidance: Constrained LLM Generation


When to Use This Skill


Use Guidance when you need to:
  • Control LLM output syntax with regex or grammars
  • Guarantee valid JSON/XML/code generation
  • Reduce latency vs traditional prompting approaches
  • Enforce structured formats (dates, emails, IDs, etc.)
  • Build multi-step workflows with Pythonic control flow
  • Prevent invalid outputs through grammatical constraints
GitHub Stars: 18,000+ | From: Microsoft Research

Installation

```bash
# Base installation
pip install guidance

# With specific backends
pip install guidance[transformers]  # Hugging Face models
pip install guidance[llama_cpp]     # llama.cpp models
```

Quick Start


Basic Example: Structured Generation

```python
from guidance import models, gen

# Load model (supports OpenAI, Transformers, llama.cpp)
lm = models.OpenAI("gpt-4")

# Generate with constraints
result = lm + "The capital of France is " + gen("capital", max_tokens=5)
print(result["capital"])  # "Paris"
```

With Anthropic Claude

```python
from guidance import models, gen, system, user, assistant

# Configure Claude
lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Use context managers for chat format
with system():
    lm += "You are a helpful assistant."
with user():
    lm += "What is the capital of France?"
with assistant():
    lm += gen(max_tokens=20)
```

Core Concepts


1. Context Managers

Guidance uses Pythonic context managers for chat-style interactions.

```python
from guidance import models, system, user, assistant, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# System message
with system():
    lm += "You are a JSON generation expert."

# User message
with user():
    lm += "Generate a person object with name and age."

# Assistant response
with assistant():
    lm += gen("response", max_tokens=100)

print(lm["response"])
```

**Benefits:**
- Natural chat flow
- Clear role separation
- Easy to read and maintain

2. Constrained Generation


Guidance ensures outputs match specified patterns using regex or grammars.

Regex Constraints

```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to valid email format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Constrain to date format (YYYY-MM-DD)
lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}")

# Constrain to phone number
lm += "Phone: " + gen("phone", regex=r"\d{3}-\d{3}-\d{4}")

print(lm["email"])  # Guaranteed valid email
print(lm["date"])   # Guaranteed YYYY-MM-DD format
```

**How it works:**
- The regex is converted to a grammar at the token level
- Invalid tokens are filtered during generation
- The model can only produce matching outputs

Selection Constraints

```python
from guidance import models, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to specific choices
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")

# Multiple-choice selection
lm += "Best answer: " + select(
    ["A) Paris", "B) London", "C) Berlin", "D) Madrid"],
    name="answer"
)

print(lm["sentiment"])  # One of: positive, negative, neutral
print(lm["answer"])     # One of the four listed option strings
```
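How does `select()` guarantee one of the options? Conceptually, constrained decoding masks the vocabulary at each step so that only tokens which can still extend to some allowed option survive. A toy sketch of that idea (an illustration only, not Guidance internals; the option and vocabulary lists are made up):

```python
# Allowed final outputs and a toy token vocabulary
options = ["positive", "negative", "neutral"]
vocab = ["pos", "neg", "neu", "itive", "ative", "tral", "xyz"]

def allowed_tokens(prefix):
    """Tokens that keep `prefix` extendable to one of the options."""
    return [t for t in vocab
            if any(opt.startswith(prefix + t) for opt in options)]

print(allowed_tokens(""))     # ['pos', 'neg', 'neu'] - 'xyz' is masked out
print(allowed_tokens("neu"))  # ['tral'] - the only completion of 'neutral'
```

The real implementation works on token IDs and logits, but the invariant is the same: at no step can the model emit a token that makes all options unreachable.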

3. Token Healing


Guidance automatically "heals" token boundaries between the prompt and the generation.

**Problem:** Tokenization creates unnatural boundaries. Suppose the prompt ends with the token `" is "` and the model's natural first token is `" Par"` (with a leading space): the naive result is `"The capital of France is  Paris"` (double space!).

**Solution:** Guidance backs up one token and regenerates.

```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Token healing is enabled by default
lm += "The capital of France is " + gen("capital", max_tokens=5)
# Result: "The capital of France is Paris" (correct spacing)
```

**Benefits:**
- Natural text boundaries
- No awkward spacing issues
- Better model performance (the model sees natural token sequences)
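The spacing problem and the healing step can be mimicked with plain strings (an illustration of the idea only; real token healing operates on token IDs inside the decoder):

```python
prompt = "The capital of France is "  # prompt ends with a trailing space
completion = " Paris"                 # model's natural first token also starts with one

naive = prompt + completion
print(repr(naive))   # 'The capital of France is  Paris' - double space

# Token healing: back up over the prompt's trailing boundary and let the
# model regenerate it together with the continuation
healed = prompt.rstrip(" ") + completion
print(repr(healed))  # 'The capital of France is Paris'
```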

4. Grammar-Based Generation


Define complex structures using context-free grammars.

```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# JSON grammar (simplified)
json_grammar = """
{
    "name": <gen name regex="[A-Za-z ]+" max_tokens=20>,
    "age": <gen age regex="[0-9]+" max_tokens=3>,
    "email": <gen email regex="[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}" max_tokens=50>
}
"""

# Generate valid JSON
lm += gen("person", grammar=json_grammar)
print(lm["person"])  # Guaranteed valid JSON structure
```

**Use cases:**
- Complex structured outputs
- Nested data structures
- Programming language syntax
- Domain-specific languages

5. Guidance Functions

Create reusable generation patterns with the `@guidance` decorator.

```python
from guidance import guidance, gen, models

@guidance
def generate_person(lm):
    """Generate a person with name and age."""
    lm += "Name: " + gen("name", max_tokens=20, stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+", max_tokens=3)
    return lm

# Use the function
lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = generate_person(lm)
print(lm["name"])
print(lm["age"])
```

**Stateful Functions:**

```python
from guidance import guidance, gen, select

@guidance(stateless=False)
def react_agent(lm, question, tools, max_rounds=5):
    """ReAct agent with tool use."""
    lm += f"Question: {question}\n\n"

    for i in range(max_rounds):
        # Thought
        lm += f"Thought {i+1}: " + gen("thought", stop="\n")

        # Action
        lm += "\nAction: " + select(list(tools.keys()), name="action")

        # Execute tool
        tool_result = tools[lm["action"]]()
        lm += f"\nObservation: {tool_result}\n\n"

        # Check if done
        lm += "Done? " + select(["Yes", "No"], name="done")
        if lm["done"] == "Yes":
            break

    # Final answer
    lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
    return lm
```

Backend Configuration


Anthropic Claude

```python
from guidance import models

lm = models.Anthropic(
    model="claude-sonnet-4-5-20250929",
    api_key="your-api-key"  # Or set ANTHROPIC_API_KEY env var
)
```

OpenAI

```python
from guidance import models

lm = models.OpenAI(
    model="gpt-4o-mini",
    api_key="your-api-key"  # Or set OPENAI_API_KEY env var
)
```

Local Models (Transformers)

```python
from guidance.models import Transformers

lm = Transformers(
    "microsoft/Phi-4-mini-instruct",
    device="cuda"  # Or "cpu"
)
```

Local Models (llama.cpp)

```python
from guidance.models import LlamaCpp

lm = LlamaCpp(
    model_path="/path/to/model.gguf",
    n_ctx=4096,
    n_gpu_layers=35
)
```

Common Patterns


Pattern 1: JSON Generation

```python
from guidance import models, gen, system, user, assistant

lm = models.Anthropic("claude-sonnet-4-5-20250929")

with system():
    lm += "You generate valid JSON."

with user():
    lm += "Generate a user profile with name, age, and email."

with assistant():
    lm += """{
    "name": """ + gen("name", regex=r'"[A-Za-z ]+"', max_tokens=30) + """,
    "age": """ + gen("age", regex=r"[0-9]+", max_tokens=3) + """,
    "email": """ + gen("email", regex=r'"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"', max_tokens=50) + """
}"""

print(lm)  # Valid JSON guaranteed
```

Pattern 2: Classification

```python
from guidance import models, gen, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

text = "This product is amazing! I love it."

lm += f"Text: {text}\n"
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")
lm += "\nConfidence: " + gen("confidence", regex=r"[0-9]+", max_tokens=3) + "%"

print(f"Sentiment: {lm['sentiment']}")
print(f"Confidence: {lm['confidence']}%")
```

Pattern 3: Multi-Step Reasoning

```python
from guidance import models, gen, guidance

@guidance
def chain_of_thought(lm, question):
    """Generate answer with step-by-step reasoning."""
    lm += f"Question: {question}\n\n"

    # Generate multiple reasoning steps
    for i in range(3):
        lm += f"Step {i+1}: " + gen(f"step_{i+1}", stop="\n", max_tokens=100) + "\n"

    # Final answer
    lm += "\nTherefore, the answer is: " + gen("answer", max_tokens=50)

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = chain_of_thought(lm, "What is 15% of 200?")

print(lm["answer"])
```

Pattern 4: ReAct Agent

```python
from guidance import models, gen, select, guidance

@guidance(stateless=False)
def react_agent(lm, question):
    """ReAct agent with tool use."""
    tools = {
        "calculator": lambda expr: eval(expr),  # demo only: eval is unsafe on untrusted input
        "search": lambda query: f"Search results for: {query}",
    }

    lm += f"Question: {question}\n\n"

    for _ in range(5):
        # Thought
        lm += "Thought: " + gen("thought", stop="\n") + "\n"

        # Action selection
        lm += "Action: " + select(["calculator", "search", "answer"], name="action")

        if lm["action"] == "answer":
            lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
            break

        # Action input
        lm += "\nAction Input: " + gen("action_input", stop="\n") + "\n"

        # Execute tool
        if lm["action"] in tools:
            result = tools[lm["action"]](lm["action_input"])
            lm += f"Observation: {result}\n\n"

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = react_agent(lm, "What is 25 * 4 + 10?")
print(lm["answer"])
```

Pattern 5: Data Extraction

```python
from guidance import models, gen, guidance

@guidance
def extract_entities(lm, text):
    """Extract structured entities from text."""
    lm += f"Text: {text}\n\n"

    # Extract person
    lm += "Person: " + gen("person", stop="\n", max_tokens=30) + "\n"

    # Extract organization
    lm += "Organization: " + gen("organization", stop="\n", max_tokens=30) + "\n"

    # Extract date
    lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}", max_tokens=10) + "\n"

    # Extract location
    lm += "Location: " + gen("location", stop="\n", max_tokens=30) + "\n"

    return lm

text = "Tim Cook announced at Apple Park on 2024-09-15 in Cupertino."

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = extract_entities(lm, text)

print(f"Person: {lm['person']}")
print(f"Organization: {lm['organization']}")
print(f"Date: {lm['date']}")
print(f"Location: {lm['location']}")
```

Best Practices


1. Use Regex for Format Validation

```python
# ✅ Good: Regex ensures valid format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# ❌ Bad: Free generation may produce invalid emails
lm += "Email: " + gen("email", max_tokens=50)
```
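One detail worth a sanity check with plain `re` (independent of Guidance): the dot before the TLD must be escaped as `\.`, otherwise it matches any character and the pattern accepts strings that are not emails:

```python
import re

loose  = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}")   # unescaped dot
strict = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")  # escaped dot

print(bool(strict.fullmatch("user@example.com")))  # True
print(bool(strict.fullmatch("user@examplecom")))   # False - no literal dot
print(bool(loose.fullmatch("user@examplecom")))    # True - bare "." matched the "c"
```

Since the constraint only guarantees what the regex expresses, a loose pattern yields a loose guarantee.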

2. Use select() for Fixed Categories

```python
# ✅ Good: Guaranteed valid category
lm += "Status: " + select(["pending", "approved", "rejected"], name="status")

# ❌ Bad: May generate typos or invalid values
lm += "Status: " + gen("status", max_tokens=20)
```

3. Leverage Token Healing

```python
# Token healing is enabled by default.
# No special action needed - just concatenate naturally.
lm += "The capital is " + gen("capital")  # Automatic healing
```

4. Use stop Sequences

```python
# ✅ Good: Stop at newline for single-line outputs
lm += "Name: " + gen("name", stop="\n")

# ❌ Bad: May generate multiple lines
lm += "Name: " + gen("name", max_tokens=50)
```

5. Create Reusable Functions

```python
# ✅ Good: Reusable pattern
@guidance
def generate_person(lm):
    lm += "Name: " + gen("name", stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+")
    return lm

# Use multiple times
lm = generate_person(lm)
lm += "\n\n"
lm = generate_person(lm)
```

6. Balance Constraints

```python
# ✅ Good: Reasonable constraints
lm += gen("name", regex=r"[A-Za-z ]+", max_tokens=30)

# ❌ Too strict: May fail or be very slow
lm += gen("name", regex=r"^(John|Jane)$", max_tokens=10)
```

Comparison to Alternatives

| Feature | Guidance | Instructor | Outlines | LMQL |
|---|---|---|---|---|
| Regex Constraints | ✅ Yes | ❌ No | ✅ Yes | ✅ Yes |
| Grammar Support | ✅ CFG | ❌ No | ✅ CFG | ✅ CFG |
| Pydantic Validation | ❌ No | ✅ Yes | ✅ Yes | ❌ No |
| Token Healing | ✅ Yes | ❌ No | ✅ Yes | ❌ No |
| Local Models | ✅ Yes | ⚠️ Limited | ✅ Yes | ✅ Yes |
| API Models | ✅ Yes | ✅ Yes | ⚠️ Limited | ✅ Yes |
| Pythonic Syntax | ✅ Yes | ✅ Yes | ✅ Yes | ❌ SQL-like |
| Learning Curve | Low | Low | Medium | High |

When to choose Guidance:
  • Need regex/grammar constraints
  • Want token healing
  • Building complex workflows with control flow
  • Using local models (Transformers, llama.cpp)
  • Prefer Pythonic syntax
When to choose alternatives:
  • Instructor: Need Pydantic validation with automatic retrying
  • Outlines: Need JSON schema validation
  • LMQL: Prefer declarative query syntax

Performance Characteristics


Latency Reduction:
  • 30-50% faster than traditional prompting for constrained outputs
  • Token healing reduces unnecessary regeneration
  • Grammar constraints prevent invalid token generation
Memory Usage:
  • Minimal overhead vs unconstrained generation
  • Grammar compilation cached after first use
  • Efficient token filtering at inference time
Token Efficiency:
  • Prevents wasted tokens on invalid outputs
  • No need for retry loops
  • Direct path to valid outputs
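The "no retry loops" point is the practical payoff: free-form generation typically needs a validate-and-retry loop, where each failed attempt costs a full model call, while constrained decoding returns a matching string on the first pass. A toy mock of the retry cost (no real model or Guidance call involved; the replies are invented):

```python
import re

EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Mock unconstrained model: first reply is invalid, second is valid
replies = iter(["you can reach me at alice AT example dot com",
                "alice@example.com"])

attempts = 0
while True:
    attempts += 1
    out = next(replies)
    if EMAIL.fullmatch(out):
        break

print(attempts)  # 2 - one whole model call wasted on invalid output
# With a regex-constrained gen(), the first attempt already fullmatches,
# so the loop (and the wasted tokens) disappear.
```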

Resources


See Also


  • references/constraints.md
    - Comprehensive regex and grammar patterns
  • references/backends.md
    - Backend-specific configuration
  • references/examples.md
    - Production-ready examples