# MLflow Metrics

Run `scripts/fetch_metrics.py` to query metrics from an MLflow tracking server.

## Examples
Token usage summary:

```bash
python scripts/fetch_metrics.py -s http://localhost:5000 -x 1 -m total_tokens -a SUM,AVG
```

Output: `AVG: 223.91  SUM: 7613`

Hourly token trend (last 24h):

```bash
python scripts/fetch_metrics.py -s http://localhost:5000 -x 1 -m total_tokens -a SUM \
  -t 3600 --start-time="-24h" --end-time=now
```

Output: time-bucketed token sums per hour.
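The `-t 3600` flag groups events into fixed one-hour windows before aggregating. As a minimal sketch of the idea (illustrative only, not the script's actual implementation; the `(timestamp, tokens)` pairs are made-up data):

```python
from collections import defaultdict

def bucket_sum(events, bucket_seconds=3600):
    """Sum values per fixed time bucket.

    events: iterable of (unix_timestamp, value) pairs.
    Returns {bucket_start_timestamp: summed_value}.
    """
    totals = defaultdict(int)
    for ts, value in events:
        # Snap each timestamp down to the start of its bucket.
        bucket_start = (ts // bucket_seconds) * bucket_seconds
        totals[bucket_start] += value
    return dict(totals)

# Two events in the first hour, one in the second (hypothetical data).
events = [(100, 50), (3500, 70), (3700, 30)]
print(bucket_sum(events))  # {0: 120, 3600: 30}
```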
Latency percentiles by trace:

```bash
python scripts/fetch_metrics.py -s http://localhost:5000 -x 1 -m latency -a AVG,P95 -d trace_name
```

Error rate by status:

```bash
python scripts/fetch_metrics.py -s http://localhost:5000 -x 1 -m trace_count -a COUNT -d trace_status
```

Quality scores by evaluator (assessments):

```bash
python scripts/fetch_metrics.py -s http://localhost:5000 -x 1 -v ASSESSMENTS \
  -m assessment_value -a AVG,P50 -d assessment_name
```

Output: average and median scores for each evaluator (e.g., correctness, relevance).

Assessment count by name:

```bash
python scripts/fetch_metrics.py -s http://localhost:5000 -x 1 -v ASSESSMENTS \
  -m assessment_count -a COUNT -d assessment_name
```

JSON output: add `-o json` to any command.

## Arguments
| Arg | Required | Description |
|---|---|---|
| `-s` | Yes | MLflow server URL |
| `-x` | Yes | Experiment IDs (comma-separated) |
| `-m` | Yes | Metric name (e.g. `total_tokens`, `latency`, `trace_count`, `assessment_value`, `assessment_count`) |
| `-a` | Yes | Aggregations, comma-separated (e.g. `SUM`, `AVG`, `COUNT`, `P50`, `P95`) |
| `-d` | No | Group by dimension (e.g. `trace_name`, `trace_status`, `assessment_name`) |
| `-t` | No | Bucket size in seconds (3600=hourly, 86400=daily) |
| `--start-time` | No | Start time (e.g. `"-24h"`) |
| `--end-time` | No | Same formats as start-time (e.g. `now`) |
| `-o` | No | Output format (e.g. `json`) |

For SPANS metrics (`span_count`, `latency`), add `-v SPANS`. For ASSESSMENTS metrics, add `-v ASSESSMENTS`.

See `references/api_reference.md` for filter syntax and full API details.
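The `P50`/`P95` aggregations are percentiles over the raw values. As a rough sketch of what a percentile computes, here is the common nearest-rank definition (the server's exact interpolation method may differ; the latency list is hypothetical):

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least
    p% of the data at or below it."""
    if not values:
        raise ValueError("empty input")
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

latencies = [120, 95, 210, 180, 450, 130, 160, 300, 110, 140]
print(percentile(latencies, 50))  # 140
print(percentile(latencies, 95))  # 450
```

Note how a single slow request (450) dominates P95 while barely moving P50, which is why the examples above pair `AVG` with `P95`.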
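When calling the script from other tooling, it can help to assemble the argument list programmatically. A hedged sketch (flag names are taken from the examples above; the shape of the JSON output is not specified here, so it is parsed generically):

```python
import json
import subprocess

def build_cmd(server, experiments, metric, aggregations,
              dimension=None, view=None, output="json"):
    """Assemble a fetch_metrics.py invocation from the documented flags."""
    cmd = ["python", "scripts/fetch_metrics.py",
           "-s", server,
           "-x", ",".join(str(x) for x in experiments),
           "-m", metric,
           "-a", ",".join(aggregations),
           "-o", output]
    if dimension:
        cmd += ["-d", dimension]
    if view:  # e.g. "SPANS" or "ASSESSMENTS"
        cmd += ["-v", view]
    return cmd

def fetch(server, experiments, metric, aggregations, **kwargs):
    """Run the script with -o json and parse stdout (schema not assumed)."""
    result = subprocess.run(
        build_cmd(server, experiments, metric, aggregations, **kwargs),
        capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

print(build_cmd("http://localhost:5000", [1], "total_tokens", ["SUM", "AVG"]))
```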