Find the **root span**: the span with `parent_span_id` equal to `null` (i.e., it has no parent). This is the top-level operation in the trace.
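As a rough illustration of the root-span lookup, here is a minimal Python sketch that works on a trace already saved as JSON. It assumes spans are stored as a list under `data.spans`, which is not confirmed by the text above; the sample trace is hand-written and hypothetical, and only `parent_span_id` and `attributes` come from the description.

```python
from typing import Optional


def find_root_span(trace: dict) -> Optional[dict]:
    """Return the first span whose parent_span_id is null, or None if absent."""
    # Assumption: spans are stored as a list under trace["data"]["spans"].
    spans = (trace.get("data") or {}).get("spans") or []
    for span in spans:
        if span.get("parent_span_id") is None:
            return span
    return None


# Hypothetical, hand-written trace used only to exercise the function.
sample_trace = {
    "data": {
        "spans": [
            {
                "span_id": "a1",
                "parent_span_id": None,
                "name": "chat_turn",
                "attributes": {"mlflow.spanInputs": {"query": "hi"}},
            },
            {
                "span_id": "b2",
                "parent_span_id": "a1",
                "name": "llm_call",
                "attributes": {},
            },
        ]
    }
}

root = find_root_span(sample_trace)
print(root["name"], sorted(root["attributes"]))
```

For a real trace, load the saved file with `json.load` and pass the resulting dict to `find_root_span`.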
Examine its `attributes` dict to identify which keys hold the user input and system output. These could be:
- **MLflow standard attributes**: `mlflow.spanInputs` and `mlflow.spanOutputs` (set by the MLflow Python client)
- **Custom attributes**: Application-specific keys set via `@mlflow.trace` or `mlflow.start_span()` with custom attribute logging
- **Third-party OTel attributes**: Keys following GenAI Semantic Conventions, OpenInference, or other instrumentation conventions
The structure of these values also varies by application (e.g., a `query` string, a `messages` array, a dict with multiple fields). Inspect the actual attribute values to understand the format.
**If the root span has empty or missing inputs/outputs**, it may be a wrapper span (e.g., an orchestrator or middleware) that doesn't directly carry the chat turn data. In that case, look at its immediate children and find the span closest to the top of the hierarchy that has meaningful inputs and outputs corresponding to a chat turn.
The simplest case is a trace from the MLflow Python client (which stores inputs/outputs in `mlflow.spanInputs`/`mlflow.spanOutputs`) where the relevant span is a direct child of root. In practice, the relevant span may be deeper in the hierarchy, and traces from other clients may use different attribute keys, so explore the span tree as needed.
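One way to automate the search for the closest span with inputs and outputs is a breadth-first walk from the root, sketched below in Python. The `span_id` field name and the flat span list are assumptions, the sample spans are hypothetical, and the attribute keys default to the MLflow Python client's `mlflow.spanInputs`/`mlflow.spanOutputs` but can be overridden for other instrumentation.

```python
from collections import deque
from typing import Optional


def find_turn_span(
    spans: list,
    input_key: str = "mlflow.spanInputs",
    output_key: str = "mlflow.spanOutputs",
) -> Optional[dict]:
    """Return the span nearest the root that has non-empty inputs and outputs."""
    children = {}  # parent span id -> list of child spans
    roots = []
    for span in spans:
        parent = span.get("parent_span_id")
        if parent is None:
            roots.append(span)
        else:
            children.setdefault(parent, []).append(span)

    # Breadth-first order visits shallower spans before deeper ones.
    queue = deque(roots)
    while queue:
        span = queue.popleft()
        attrs = span.get("attributes") or {}
        # Falsy values (None, "", {}, []) are treated as empty or missing.
        if attrs.get(input_key) and attrs.get(output_key):
            return span
        # Assumption: each span exposes its own id as "span_id".
        queue.extend(children.get(span.get("span_id"), []))
    return None


# Hypothetical spans: a wrapper root with no data, then the real chat turn.
sample_spans = [
    {"span_id": "r", "parent_span_id": None, "name": "orchestrator", "attributes": {}},
    {
        "span_id": "c",
        "parent_span_id": "r",
        "name": "agent",
        "attributes": {
            "mlflow.spanInputs": {"query": "hi"},
            "mlflow.spanOutputs": {"answer": "hello"},
        },
    },
    {
        "span_id": "g",
        "parent_span_id": "c",
        "name": "llm_call",
        "attributes": {
            "mlflow.spanInputs": {"messages": []},
            "mlflow.spanOutputs": {"text": "hello"},
        },
    },
]

turn_span = find_turn_span(sample_spans)
print(turn_span["name"])
```

The check is deliberately loose: a serialized value such as the string `"{}"` would still count as non-empty, so inspect the actual attribute values before relying on the result.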
Also check the first trace's assessments. **Session-level assessments are attached to the first trace in the session** — these evaluate the session as a whole (e.g., overall conversation quality, multi-turn coherence) and can indicate the presence of issues somewhere across the entire session, not just the first turn. The first trace may also have per-turn assessments for that specific turn.
Both types appear in `.info.assessments`. Session-level assessments are identified by the presence of `mlflow.trace.session` in their `metadata` field.
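The distinction between session-level and per-turn assessments can be sketched in Python as follows. The field names (`info.assessments`, `metadata`, `assessment_name`) follow the description above; the sample assessments, their names, and their values are hypothetical.

```python
SESSION_KEY = "mlflow.trace.session"


def split_assessments(trace: dict) -> tuple:
    """Split a trace's assessments into (session_level, per_turn) lists."""
    session_level, per_turn = [], []
    assessments = (trace.get("info") or {}).get("assessments") or []
    for assessment in assessments:
        metadata = assessment.get("metadata") or {}
        # Session-level assessments carry the session key in their metadata.
        if SESSION_KEY in metadata:
            session_level.append(assessment)
        else:
            per_turn.append(assessment)
    return session_level, per_turn


# Hypothetical first trace of a session, written by hand for illustration.
first_trace = {
    "info": {
        "assessments": [
            {
                "assessment_name": "conversation_quality",
                "metadata": {"mlflow.trace.session": "session-123"},
                "feedback": {"value": "no"},
                "rationale": "The assistant repeated itself across turns.",
            },
            {
                "assessment_name": "relevance",
                "metadata": {},
                "feedback": {"value": "yes"},
                "rationale": "The answer addressed the question.",
            },
        ]
    }
}

session_level, per_turn = split_assessments(first_trace)
print([a["assessment_name"] for a in session_level])
print([a["assessment_name"] for a in per_turn])
```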
**Assessment errors are not trace errors.** If an assessment has a `feedback.error` field, it means the scorer or judge failed — not that the trace itself has a problem. Exclude these when using assessments to identify trace issues.
**Always consult the rationale when interpreting assessment values.** The `value` alone can be misleading — for example, a `user_frustration` assessment with `value: "no"` could mean "no frustration detected" or "the frustration check did not pass" (i.e., frustration *is* present), depending on how the scorer was configured. The `.rationale` field (a top-level assessment field, **not** nested under `.feedback`) explains what the value means in context. Include rationale when extracting assessments:
```bash
jq '[.info.assessments[] | select(.feedback.error == null) | {name: .assessment_name, value: .feedback.value, rationale: .rationale}]' /tmp/trace_detail.json
```
To fetch all traces in a session in chronological order:
```bash
mlflow traces search \
  --experiment-id <EXPERIMENT_ID> \
  --filter-string 'metadata.`mlflow.trace.session` = "<SESSION_ID>"' \
  --order-by "timestamp_ms ASC" \
  --extract-fields 'info.trace_id,info.state,info.request_time,info.assessments,info.trace_metadata.`mlflow.traceInputs`,info.trace_metadata.`mlflow.traceOutputs`' \
  --output json \
  --max-results 100 > /tmp/session_traces.json
```
**CLI syntax notes:**
- **`--experiment-id` is required** for all `mlflow traces search` commands. The command will fail without it.
- Metadata keys containing dots **must** be escaped with backticks in filter strings and extract-fields: `` metadata.`mlflow.trace.session` ``
- **Shell quoting**: Backticks inside **double quotes** are interpreted by bash as command substitution (e.g., bash will try to run `` `mlflow.trace.session` `` as a command). Always use **single quotes** for the outer string when the value contains backticks. For example: `` --filter-string 'metadata.`mlflow.trace.session` = "value"' ``
- `--max-results` defaults to 100, which is sufficient for most sessions. Increase up to 500 (the maximum) for longer conversations. If 500 results are returned, use pagination to retrieve the rest.
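When the search is driven from a script rather than typed into a shell, the quoting problem can be sidestepped by passing arguments as a list, so no shell ever interprets the backticks. The Python sketch below uses only the flags shown in the command above; the helper names `build_search_command` and `run_to_file` are hypothetical, and the lower bound of 1 on `max_results` is an assumption.

```python
import subprocess

MAX_RESULTS_LIMIT = 500  # maximum stated in the notes above

EXTRACT_FIELDS = ",".join(
    [
        "info.trace_id",
        "info.state",
        "info.request_time",
        "info.assessments",
        "info.trace_metadata.`mlflow.traceInputs`",
        "info.trace_metadata.`mlflow.traceOutputs`",
    ]
)


def build_search_command(experiment_id: str, session_id: str, max_results: int = 100) -> list:
    """Build the argument list for `mlflow traces search` for one session."""
    # Assumption: values below 1 are rejected here; the CLI may behave differently.
    if not 1 <= max_results <= MAX_RESULTS_LIMIT:
        raise ValueError(f"max_results must be between 1 and {MAX_RESULTS_LIMIT}")
    return [
        "mlflow", "traces", "search",
        "--experiment-id", experiment_id,
        # Backticks are literal here because no shell parses this string.
        "--filter-string", f'metadata.`mlflow.trace.session` = "{session_id}"',
        "--order-by", "timestamp_ms ASC",
        "--extract-fields", EXTRACT_FIELDS,
        "--output", "json",
        "--max-results", str(max_results),
    ]


def run_to_file(command: list, output_path: str) -> None:
    """Run the command and write its stdout to a file rather than a pipe."""
    with open(output_path, "w") as out:
        subprocess.run(command, stdout=out, check=True)
```

A call such as `run_to_file(build_search_command("<EXPERIMENT_ID>", "<SESSION_ID>"), "/tmp/session_traces.json")` would then write the same file as the shell command, assuming the `mlflow` CLI is on the `PATH`.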
**Never pipe MLflow CLI output directly** (e.g., `mlflow traces search ... | jq '.'`). This can silently produce no output. Always redirect to a file first, then run commands on the file.
To inspect a specific turn in detail (e.g., after identifying a problematic turn), fetch its full trace:
```bash
mlflow traces get --trace-id <TRACE_ID> > /tmp/turn_detail.json
```