aiconfig-ai-metrics
AI Metrics Instrumentation
You're using a skill that wires LaunchDarkly AI metrics around an existing provider call. Your job is to audit what's already there, pick the right tier from the ladder below, and implement it with the least ceremony that still captures the metrics the Monitoring tab needs (duration, input/output tokens, success/error, plus TTFT when streaming).
The single most important thing to get right: default to the highest tier that fits the shape of the call. Going lower ("just write the manual tracker calls") looks flexible but costs you drift, missed metrics, and legacy patterns the SDKs have moved past.
The four-tier ladder
This is the order the official SDK READMEs (Python core, Node core, and every provider package) recommend. Walk from the top and stop at the first tier that fits:
| Tier | Pattern | Use when | Tracks automatically |
|---|---|---|---|
| 1 — Managed runner | Managed chat runner (Node: `initChat()`; Python: the equivalent chat entry point in the provider README) | The call is conversational (chat history, turn-based). This is what the provider READMEs lead with. | Duration, tokens, success/error — all of it, zero tracker calls. |
| 2 — Provider package + `trackMetricsOf` | `trackMetricsOf` with the package's `Provider.getAIMetricsFromResponse` extractor | The shape isn't a chat loop (one-shot completion, structured output, agent step) but the framework or provider has a package. | Duration + success/error from the wrapper; tokens from the package's built-in extractor. |
| 3 — Custom extractor + `trackMetricsOf` | Same `trackMetricsOf` wrapper, with a hand-written extractor | No provider package exists (Anthropic direct, Gemini, Cohere, custom HTTP). | Duration + success/error from the wrapper; tokens from your extractor. |
| 4 — Raw manual | Separate calls to the individual tracker methods | Streaming with TTFT, unusual response shapes, partial tracking, anything Tier 2–3 can't cleanly wrap. | Only what you explicitly call — it's on you to not miss one. |
A call to `track_openai_metrics` / `trackOpenAIMetrics` / `track_bedrock_converse_metrics` / `trackBedrockConverseMetrics` / `trackVercelAISDKGenerateTextMetrics` is Tier-2 legacy shorthand. These helpers still exist in the SDK source but none of the current provider READMEs use them — they've been superseded by `trackMetricsOf` + `Provider.getAIMetricsFromResponse`. Do not recommend them for new code; if you see them in an existing codebase, leave them alone unless the user is already on a cleanup pass.
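To make the Tier-2/3 shape concrete, here is a minimal sketch. `MetricsTracker` is a stand-in that mimics what the SDK's `trackMetricsOf` wrapper does (time the call, record success/error, hand the response to an extractor for tokens); it is not the real tracker API, and the Anthropic-style response shape is assumed for illustration only.

```python
import time

class MetricsTracker:
    """Stand-in for the SDK tracker; records events instead of sending them."""
    def __init__(self):
        self.events = []

    def track_metrics_of(self, extract_metrics, call):
        # Time the provider call and record success/error around it.
        start = time.monotonic()
        try:
            response = call()
        except Exception:
            self.events.append(("error", True))
            raise
        self.events.append(("duration_ms", int((time.monotonic() - start) * 1000)))
        self.events.append(("success", True))
        # Tier 2 would use the provider package's extractor here;
        # Tier 3 supplies a hand-written one like the function below.
        self.events.append(("tokens", extract_metrics(response)))
        return response

def extract_anthropic_usage(response: dict) -> dict:
    """Tier-3 custom extractor for a provider with no package (assumed shape)."""
    usage = response.get("usage", {})
    return {"input": usage.get("input_tokens", 0), "output": usage.get("output_tokens", 0)}

tracker = MetricsTracker()
fake_call = lambda: {"content": "hi", "usage": {"input_tokens": 12, "output_tokens": 5}}
result = tracker.track_metrics_of(extract_anthropic_usage, fake_call)
```

Note that the provider call itself (`fake_call` here) is passed in unmodified — that is what makes Tier 2/3 a wrapper rather than a rewrite.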
Workflow
1. Explore the existing call site
Before picking a tier, find the provider call and answer these questions:
- Shape? Is it a chat loop (history + turn-based), a one-shot completion, an agent step, or something else? → drives Tier 1 vs 2.
- Framework? Raw provider SDK? LangChain / LangGraph? Vercel AI SDK? CrewAI? → drives which Tier-2 provider package (if any) applies.
- Provider? OpenAI, Anthropic, Bedrock, Gemini, Azure, custom HTTP? → cross-reference with the package availability matrix below.
- Streaming? If yes, you'll need TTFT tracking, which means Tier 4 for the TTFT part even if the rest is Tier 2.
- Language? Python or Node? Provider-package coverage differs between them.
- Already using an AI Config? If not, route to `aiconfig-create` first — tracking requires a tracker, which comes from `completion_config()` / `completionConfig()` (or `initChat()` for the Tier-1 chat runner).
2. Look up your Tier-2 option
Use this matrix to decide whether Tier 2 (provider package) is available for your situation. If it's not, drop to Tier 3 (custom extractor). If the shape is chat-loop, go to Tier 1 first regardless of what's in this matrix.
| Framework / provider | Python provider package | Node provider package | Reference |
|---|---|---|---|
| OpenAI (direct SDK) | ✓ (package available) | ✓ (package available) | openai-tracking.md |
| LangChain / LangGraph | ✓ | ✓ | (use the LangChain provider docs) |
| Vercel AI SDK | — | ✓ | (use the Vercel provider docs) |
| AWS Bedrock (Converse or InvokeModel) | — (use LangChain-aws or custom extractor) | — (use LangChain-aws or custom extractor) | bedrock-tracking.md |
| Anthropic direct SDK | — | — | anthropic-tracking.md |
| Gemini / Google GenAI | — | — | Tier 3 custom extractor |
| Cohere, Mistral, custom HTTP | — | — | Tier 3 custom extractor |
| Any provider, streaming + TTFT | — (Tier 4 only) | ✓ (streaming helper) | streaming-tracking.md |
3. Implement from the matching reference
Once you know the tier and the provider, open the reference file and follow the pattern. The references are written so Tier 1 is always the first example, Tier 2/3 next, and Tier 4 last. Stop at the first tier that matches the app's shape.
Guardrails that apply to every tier:
- Always check `config.enabled` before making the tracked call. A disabled config means the user has flagged the feature off — you should short-circuit to whatever fallback the app uses (cached response, error, degraded path) rather than making the provider call at all.
- Wrap the existing call, don't rewrite it. Tier 2 and Tier 3 are designed to slot around an unmodified provider call. If you find yourself rewriting the call to fit the tracker, you're at the wrong tier — drop down one.
- Errors go through the tracker too. `trackMetricsOf` handles the success path; errors still need an explicit `tracker.trackError()` in the catch block (or a try/except around the whole thing). Tier 1 handles both paths automatically.
- Flush in short-lived processes. In serverless, cron jobs, CLI scripts — anything that exits quickly — call `ldClient.flush()` (sync or await) before the process terminates, or the tracker events never leave the machine.
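The four guardrails above can be sketched together around one tracked call. The `StubConfig`, `StubTracker`, and `StubClient` classes here are stand-ins carrying the attribute names the guardrails reference (`config.enabled`, the tracker's error method, the client's `flush()`); the real objects come from the SDK's AI Config evaluation, and the method signatures below are illustrative, not the real API.

```python
class StubTracker:
    """Stand-in tracker: records which tracking calls were made."""
    def __init__(self):
        self.calls = []
    def track_success(self):
        self.calls.append("success")
    def track_error(self):
        self.calls.append("error")

class StubConfig:
    """Stand-in AI Config with the enabled flag and attached tracker."""
    def __init__(self, enabled):
        self.enabled = enabled
        self.tracker = StubTracker()

class StubClient:
    """Stand-in LaunchDarkly client with a flush() method."""
    def __init__(self):
        self.flushed = False
    def flush(self):
        self.flushed = True

def tracked_completion(config, ld_client, provider_call, fallback="feature disabled"):
    if not config.enabled:                # guardrail 1: short-circuit when flagged off
        return fallback
    try:
        response = provider_call()        # guardrail 2: the provider call is unmodified
        config.tracker.track_success()
        return response
    except Exception:
        config.tracker.track_error()      # guardrail 3: errors go through the tracker too
        raise
    finally:
        ld_client.flush()                 # guardrail 4: flush before a short-lived process exits

cfg, client = StubConfig(enabled=True), StubClient()
result = tracked_completion(cfg, client, lambda: "ok")
```

Because the flush lives in `finally`, it runs on the success path, the error path, and not at all when the config is disabled (the function returns before the `try`).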
4. Verify
Confirm the Monitoring tab fills in:
- Run one real request through the instrumented path.
- Open the AI Config in LaunchDarkly → Monitoring tab. Duration, token counts, and generation counts should appear within 1–2 minutes.
- Force an error (bad API key, zero `max_tokens`, whatever) and confirm the error count increments.
- If streaming: verify TTFT appears. If it doesn't, you probably wrapped the stream creation with `trackMetricsOf` but didn't add the manual `trackTimeToFirstToken` call — see streaming-tracking.md.
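The manual TTFT step is the part the generic wrapper misses: the clock starts before the stream is created and stops at the first chunk. A minimal sketch, using a plain dict as a stand-in for the tracker's TTFT / tokens / success calls (the real SDK methods are named in the quick reference below):

```python
import time

def consume_stream(chunks, metrics):
    """Consume a token stream, recording TTFT manually (Tier 4 for this part)."""
    start = time.monotonic()
    pieces = []
    for i, chunk in enumerate(chunks):
        if i == 0:
            # stand-in for tracker.trackTimeToFirstToken(elapsed_ms)
            metrics["ttft_ms"] = (time.monotonic() - start) * 1000.0
        pieces.append(chunk)
    metrics["output_chunks"] = len(pieces)  # stand-in for trackTokens at stream end
    metrics["success"] = True               # stand-in for trackSuccess
    return "".join(pieces)

metrics = {}
text = consume_stream(iter(["Hel", "lo"]), metrics)
```

If `ttft_ms` never gets set in your real integration, the wrapper was applied to the stream *creation* only and the first-chunk hook is missing.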
Quick reference: tracker methods
The tracker object (`config.tracker` / `aiConfig.tracker`) provides these methods. This is the raw API surface — most of the time you should not call the individual methods, you should use `trackMetricsOf` or a Tier-1 managed runner. The list is here so you can recognize the methods in existing code and reach for the right one when you genuinely need Tier 4.
| Method (Python ↔ Node) | Tier | What it does |
|---|---|---|
| `track_metrics_of` ↔ `trackMetricsOf` | 2 / 3 | Wraps a provider call, captures duration + success/error, calls your extractor for tokens. This is the default generic tracker. |
| async variant of `trackMetricsOf` | 2 / 3 | Async variant of the above. |
| streaming variant of `trackMetricsOf` | 2 / 3 | Streaming variant. Captures per-chunk usage when the extractor handles chunks. Does not auto-capture TTFT. |
| `track_duration` ↔ `trackDuration` | 4 | Record latency in milliseconds. |
| `track_duration_of` ↔ `trackDurationOf` | 4 | Wraps a callable and records duration automatically. Does not capture tokens or success — pair with explicit calls. |
| `track_tokens` ↔ `trackTokens` | 4 | Record token usage. |
| `track_time_to_first_token` ↔ `trackTimeToFirstToken` | 4 | Record TTFT for streaming responses. |
| `track_success` ↔ `trackSuccess` | 4 | Mark the generation as successful. Required for the Monitoring tab to count it. |
| `track_error` ↔ `trackError` | 4 | Mark the generation as failed. Do not also call `trackSuccess` for the same generation. |
| `track_feedback` ↔ `trackFeedback` | any | Record thumbs-up / thumbs-down from a feedback UI. Independent of the success/error path. |
| `track_openai_metrics` ↔ `trackOpenAIMetrics` | legacy | Predates provider packages. Still works; do not use in new code. Replace with `trackMetricsOf` + the provider package's extractor. |
| `track_bedrock_converse_metrics` ↔ `trackBedrockConverseMetrics` | legacy | Same story. Do not use in new code. |
| `trackVercelAISDKGenerateTextMetrics` (Node only) | legacy | Same story. Use `trackMetricsOf` instead. |
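The pairing rule in the table (duration-only wrapper, tokens and success recorded explicitly) is the easiest Tier-4 mistake to make. A sketch of the full required sequence, with `Recorder` as a stand-in for the SDK tracker and an assumed response shape:

```python
import time

class Recorder:
    """Stand-in tracker: stores metrics in a dict instead of sending events."""
    def __init__(self):
        self.metrics = {}
    def track_duration_of(self, fn):
        # Only duration is captured here, matching the table's caveat.
        start = time.monotonic()
        result = fn()
        self.metrics["duration_ms"] = (time.monotonic() - start) * 1000.0
        return result
    def track_tokens(self, usage):
        self.metrics["tokens"] = usage
    def track_success(self):
        self.metrics["success"] = True

tracker = Recorder()
response = tracker.track_duration_of(
    lambda: {"text": "ok", "usage": {"input": 3, "output": 1}}  # fake provider call
)
tracker.track_tokens(response["usage"])  # explicit: the duration wrapper did not capture this
tracker.track_success()                  # explicit: required for the generation count
```

Drop either of the two explicit calls and the Monitoring tab silently loses that metric — which is why Tier 4 is the last resort, not the default.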
Related skills
- `aiconfig-create` — prerequisite if the app doesn't have an AI Config yet
- `aiconfig-custom-metrics` — business metrics (conversion, resolution, retention) layered on top of the AI metrics this skill captures
- `aiconfig-online-evals` — automatic quality scoring (LLM-as-judge) on sampled live requests; complementary to the metrics here
- `aiconfig-migrate` — Stage 4 of the hardcoded-to-AI-Configs migration delegates to this skill