langfuse-observability
Langfuse Observability
Instrument LLM applications with Langfuse tracing, following best practices and tailored to your use case.
When to Use
- Setting up Langfuse in a new project
- Auditing existing Langfuse instrumentation
- Adding observability to LLM calls
Workflow
1. Assess Current State
Check the project:
- Is Langfuse SDK installed?
- What LLM frameworks are used? (OpenAI SDK, LangChain, LlamaIndex, Vercel AI SDK, etc.)
- Is there existing instrumentation?
No integration yet: Set up Langfuse using a framework integration if available. Integrations capture more context automatically and require less code than manual instrumentation.
Integration exists: Audit against baseline requirements below.
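For a Python project, the first two checks can be partly automated. The sketch below only probes whether some top-level packages are importable; the candidate list is an illustrative assumption, not an exhaustive inventory, and it says nothing about JavaScript frameworks such as the Vercel AI SDK.

```python
from importlib.util import find_spec

# Top-level import names to probe. This list is an assumption for illustration.
CANDIDATES = ("langfuse", "openai", "langchain", "llama_index", "litellm")


def detect_installed(modules=CANDIDATES):
    """Return a dict mapping each module name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, present in detect_installed().items():
        print(f"{name}: {'installed' if present else 'not found'}")
```

Whether instrumentation already exists still has to be judged by reading the code; an installed package does not mean it is wired up.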
2. Verify Baseline Requirements
Every trace should have these fundamentals:
| Requirement | Check | Why |
|---|---|---|
| Model name | Is the LLM model captured? | Enables model comparison and filtering |
| Token usage | Are input/output tokens tracked? | Enables automatic cost calculation |
| Good trace names | Are names descriptive? | Makes traces findable and filterable |
| Span hierarchy | Are multi-step operations nested properly? | Shows which step is slow or failing |
| Correct observation types | Are generations marked as generations? | Enables model-specific analytics |
| Sensitive data masked | Is PII/confidential data excluded or masked? | Prevents data leakage |
| Trace input/output | Does the trace capture the full data being processed as input, and the result as output? | Enables debugging and understanding what was processed |
Framework integrations (OpenAI, LangChain, etc.) handle model name, tokens, and observation types automatically. Prefer integrations over manual instrumentation.
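The "Sensitive data masked" row can be illustrated with a small masking helper. This is a minimal sketch that redacts only email addresses; the function name, the regular expression, and the placeholder text are assumptions, and real PII handling usually needs more patterns. How the helper is hooked into the tracing SDK is deliberately left out.

```python
import re

# Simple email pattern; an assumption for illustration, not a complete PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_pii(data):
    """Recursively replace email addresses in strings, dicts, and lists."""
    if isinstance(data, str):
        return EMAIL_RE.sub("[REDACTED_EMAIL]", data)
    if isinstance(data, dict):
        return {key: mask_pii(value) for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        return [mask_pii(item) for item in data]
    return data  # numbers, None, and other types pass through unchanged
```

Applying the helper to trace input and output before they are recorded keeps the original payload out of the tracing backend.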
3. Explore Traces First
Once baseline instrumentation is working, encourage the user to explore their traces in the Langfuse UI before adding more context:
"Your traces are now appearing in Langfuse. Take a look at a few of them—see what data is being captured, what's useful, and what's missing. This will help us decide what additional context to add."
This helps the user:
- Understand what they're already getting
- Form opinions about what's missing
- Ask better questions about what they need
4. Discover Additional Context Needs
Determine what additional instrumentation would be valuable. Infer from code when possible; ask only when unclear.
Infer from code:
| If you see in code... | Infer | Suggest |
|---|---|---|
| Conversation history, chat endpoints, message arrays | Multi-turn app | Add session_id |
| User authentication | User-aware app | Add a user identifier to traces |
| Multiple distinct endpoints/features | Multi-feature app | Add a feature tag |
| Customer/tenant identifiers | Multi-tenant app | Add a customer or segment tag |
| Feedback collection, ratings | Has user feedback | Capture as scores |
Only ask when not obvious from code:
- "How do you know when a response is good vs bad?" → Determines scoring approach
- "What would you want to filter by in a dashboard?" → Surfaces non-obvious tags
- "Are there different user segments you'd want to compare?" → Customer tiers, plans, etc.
Additions and their value:
| Addition | Why | Docs |
|---|---|---|
| session_id | Groups conversations together | https://langfuse.com/docs/tracing-features/sessions |
| User identifier | Enables user filtering and cost attribution | https://langfuse.com/docs/tracing-features/users |
| User feedback score | Enables quality filtering and trends | https://langfuse.com/docs/scores/overview |
| Feature tag | Per-feature analytics | https://langfuse.com/docs/tracing-features/tags |
| Segment tag | Cost/quality breakdown by segment | https://langfuse.com/docs/tracing-features/tags |
These are NOT baseline requirements—only add what's relevant based on inference or user input.
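One way to keep these additions optional is to collect them in a single place and include only what the application actually knows. The sketch below is hypothetical: the function name, the `user_id`, `feature`, and `tier` parameters, and the `key:value` tag format are assumptions for illustration (only `session_id` is named in this guide), and passing the result to the tracing SDK is not shown.

```python
def build_trace_context(session_id=None, user_id=None, feature=None, tier=None):
    """Collect only the optional trace attributes that are actually known."""
    context = {}
    if session_id:
        context["session_id"] = session_id
    if user_id:
        context["user_id"] = user_id

    # Tags are only added when there is something to tag.
    tags = []
    if feature:
        tags.append(f"feature:{feature}")
    if tier:
        tags.append(f"tier:{tier}")
    if tags:
        context["tags"] = tags
    return context
```

A single-turn summarization endpoint with no login would call `build_trace_context(feature="summarize")` and get back only a tag, which matches the idea that nothing here is mandatory.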
5. Guide to UI
After adding context, point users to relevant UI features:
- Traces view: See individual requests
- Sessions view: See grouped conversations (if session_id added)
- Dashboard: Build filtered views using tags
- Scores: Filter by quality metrics
Framework Integrations
Prefer these over manual instrumentation:
| Framework | Integration | Docs |
|---|---|---|
| OpenAI SDK | Drop-in replacement | https://langfuse.com/docs/integrations/openai |
| LangChain | Callback handler | https://langfuse.com/docs/integrations/langchain |
| LlamaIndex | Callback handler | https://langfuse.com/docs/integrations/llama-index |
| Vercel AI SDK | OpenTelemetry exporter | https://langfuse.com/docs/integrations/vercel-ai-sdk |
| LiteLLM | Callback or proxy | https://langfuse.com/docs/integrations/litellm |
Full list: https://langfuse.com/docs/integrations
Always Explain Why
When suggesting additions, explain the user benefit:
"I recommend adding session_id to your traces.
Why: This groups messages from the same conversation together.
You'll be able to see full conversation flows in the Sessions view,
making it much easier to debug multi-turn interactions.
Learn more: https://langfuse.com/docs/tracing-features/sessions"
Common Mistakes
| Mistake | Problem | Fix |
|---|---|---|
| No flush in scripts | Traces never sent | Flush pending traces before the script exits |
| Flat traces | Can't see which step failed | Use nested spans for distinct steps |
| Generic trace names | Hard to filter | Use descriptive names |
| Logging sensitive data | Data leakage risk | Mask PII before tracing |
| Manual instrumentation when integration exists | More code, less context | Use framework integration |
| Langfuse import before env vars loaded | Langfuse initializes with missing/wrong credentials | Import Langfuse AFTER loading environment variables |
| Wrong import order with OpenAI | Langfuse can't patch the OpenAI client | Import Langfuse and call its setup BEFORE importing OpenAI client |
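The credential-related row above can be guarded with a small preflight check that runs before Langfuse is imported. This is a sketch under assumptions: the helper name is hypothetical, and it checks only the `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` environment variables, which is how the SDK is commonly configured but is not stated in this guide.

```python
import os

# Variables assumed to hold the Langfuse credentials.
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")


def missing_langfuse_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_langfuse_vars()
    if missing:
        print("Set these before importing Langfuse: " + ", ".join(missing))
    else:
        print("Credentials present; safe to import Langfuse now.")
```

Running the check after environment variables are loaded and before any Langfuse import makes a wrong import order fail loudly instead of silently producing no traces.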