oodle-metrics
Oodle Metrics: Discovery and Querying
This skill teaches the agent to find metric names, enumerate their labels, and build PromQL queries that return data on the first try.
Prerequisites

```bash
brew install oodle-ai/oodle/oodle
oodle configure
```

Confirm the metrics endpoint works:

```bash
oodle metrics list --limit 5 -o json | jq 'length'
```

Command Execution Order

Before running any oodle command:

1. Check whether the required metric name is already in context.
2. If not, run `oodle metrics list --match <prefix>` to discover it.
3. Run `oodle metrics labels <name>` to enumerate labels.
4. Run `oodle metrics label-values <name> <label>` to enumerate the values you need to filter on.
5. Build the PromQL query using only labels confirmed in step 3 and values confirmed in step 4.

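The order above can be sketched as one shell function. This is a minimal sketch, not part of the oodle CLI: `discover_metric` is a hypothetical helper name, and it relies only on the three subcommands named in the list. In practice you would read each step's output before choosing the argument for the next step; the function takes all three arguments up front only to keep the example short.

```bash
# Hypothetical helper (not part of the oodle CLI): runs the discovery
# commands in the documented order and stops at the first failure.
# Usage: discover_metric <prefix> <name> <label>
discover_metric() {
  local prefix="$1" name="$2" label="$3"

  echo "## metrics matching ${prefix}"
  oodle metrics list --match "$prefix" || return 1

  echo "## labels on ${name}"
  oodle metrics labels "$name" || return 1

  echo "## values of ${label} on ${name}"
  oodle metrics label-values "$name" "$label" || return 1
}
```

Example call: `discover_metric http_requests http_requests_total service`.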
Quick Reference

| Task | Command |
|---|---|
| List metric names (filtered) | `oodle metrics list --match <prefix>` |
| List metric names (paged) | `oodle metrics list --limit <n>` |
| Get one metric's metadata | |
| List labels for a metric | `oodle metrics labels <name>` |
| List values for a label | `oodle metrics label-values <name> <label>` |

Common Operations
The 3-step label discovery workflow

Always run these three commands in order before writing a PromQL expression for a metric you have not used before.

```bash
# Step 1: find the metric name
oodle metrics list --match "http_requests" -o json
# returns e.g. ["http_requests_total", "http_requests_in_flight"]

# Step 2: list available labels
oodle metrics labels http_requests_total
# returns e.g. ["service", "method", "status", "env"]

# Step 3: list values for a label
oodle metrics label-values http_requests_total service
# returns e.g. ["api", "checkout", "auth"]
```

```bash
# ✅ CORRECT: query built from confirmed labels and values
sum by (service) (rate(http_requests_total{service="api",env="prod",status=~"5.."}[5m]))

# ❌ WRONG: guessing label names; query returns no data
sum by (svc) (rate(http_requests{app="api",environment="production",http_status=~"5.."}[5m]))
```

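The step from confirmed labels to a query can be sketched as a small helper that assembles a selector only from the pairs you pass in. `build_selector` is a hypothetical name, not an oodle command, and the sketch handles equality matchers only; a regex matcher such as `status=~"5.."` would still be written by hand.

```bash
# Hypothetical helper: build a PromQL selector from label=value pairs
# that were already confirmed with `oodle metrics labels` and
# `oodle metrics label-values`. Equality matchers only.
# Usage: build_selector <metric> [label=value ...]
build_selector() {
  local metric="$1"
  shift
  local matchers="" pair label value
  for pair in "$@"; do
    label="${pair%%=*}"   # text before the first '='
    value="${pair#*=}"    # text after the first '='
    matchers="${matchers:+${matchers},}${label}=\"${value}\""
  done
  printf '%s{%s}\n' "$metric" "$matchers"
}
```

For example, `build_selector http_requests_total service=api env=prod` prints `http_requests_total{service="api",env="prod"}`, which can then be wrapped in `rate(...[5m])`.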
Filtering and paging metric lists

```bash
# ✅ CORRECT: narrow with --match
oodle metrics list --match "http_requests" -o json

# ✅ CORRECT: page with --limit when sweeping a namespace
oodle metrics list --match "kube_" --limit 200 -o json

# ❌ WRONG: listing every metric in the system, then grepping
oodle metrics list -o json | jq '.[] | select(. | contains("http"))'
```

Cardinality-aware querying

Before grouping by a label, confirm cardinality is bounded:

```bash
# ✅ CORRECT: confirm `service` has <100 values before grouping by it
oodle metrics label-values http_requests_total service | wc -l

# ❌ WRONG: grouping by a high-cardinality label like `request_id` overloads the query
sum by (request_id) (rate(http_requests_total[5m]))
```

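The cardinality check can be turned into a guard that a script tests before grouping. This is a sketch under one assumption: that `oodle metrics label-values` prints one value per line, as the `wc -l` check above implies. `cardinality_ok` is a hypothetical helper name, and the default limit of 100 mirrors the threshold mentioned above.

```bash
# Hypothetical helper: succeed only when stdin has fewer than <limit> lines.
# Assumes one label value per line.
# Usage: oodle metrics label-values <name> <label> | cardinality_ok [limit]
cardinality_ok() {
  local limit="${1:-100}"
  local count
  # Strip whitespace because some wc implementations pad the count.
  count="$(wc -l | tr -d '[:space:]')"
  [ "$count" -lt "$limit" ]
}
```

A script could then run `oodle metrics label-values http_requests_total service | cardinality_ok 100 && echo "safe to group by service"`.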
Best Practices

Always use `--match <prefix>` when listing metrics

The metrics namespace can be large; an unfiltered `oodle metrics list` is slow and noisy.

```bash
# ✅ CORRECT
oodle metrics list --match "http_requests" -o json

# ❌ WRONG: returns thousands of results, may time out
oodle metrics list -o json
```

Run the 3-step discovery workflow before writing PromQL

A query built on guessed label names returns no data, which is easy to mistake for a missing metric.

```bash
# ✅ CORRECT: labels confirmed by step 2, values confirmed by step 3
oodle metrics labels http_requests_total
oodle metrics label-values http_requests_total service
sum by (service) (rate(http_requests_total{service="api"}[5m]))

# ❌ WRONG: writing the query first, then debugging "why is it empty?"
sum by (service_name) (rate(http_request_count{service_name="api"}[5m]))
```

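One way to catch guessed labels before a query is run is to compare the labels you intend to use against the confirmed list. The sketch below assumes `oodle metrics labels <name>` prints one label per line (an assumption; with `-o json` the output would need converting first), and `unconfirmed_labels` is a hypothetical helper name.

```bash
# Hypothetical helper: print every intended label that is NOT in the
# confirmed list, and return non-zero if any were found.
# stdin: confirmed label names, one per line (assumed output shape)
# args:  label names you intend to use in the query
unconfirmed_labels() {
  local confirmed wanted status=0
  confirmed="$(cat)"
  for wanted in "$@"; do
    # -x matches whole lines, -F treats the label as a literal string
    if ! printf '%s\n' "$confirmed" | grep -qxF -- "$wanted"; then
      echo "$wanted"
      status=1
    fi
  done
  return "$status"
}
```

For example, `oodle metrics labels http_requests_total | unconfirmed_labels service_name` would print `service_name` and fail if that label does not exist on the metric.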
Always pipe to `jq` for scripting, never parse table output

Column ordering and widths in table output are not stable.

```bash
# ✅ CORRECT
oodle metrics list --match "http_" -o json | jq -r '.[]'

# ❌ WRONG
oodle metrics list --match "http_" | tail -n +2 | awk '{print $1}'
```

Use `rate(...[5m])` for every counter rate, not `rate(...[1m])`

A `[1m]` window is too short: one missed scrape is enough to cause flapping graphs and false alerts.

```bash
# ✅ CORRECT
sum by (service) (rate(http_requests_total[5m]))

# ❌ WRONG: flapping graphs, false alerts when a single scrape is missed
sum by (service) (rate(http_requests_total[1m]))
```

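The reasoning behind the window size can be checked with simple arithmetic. In Prometheus-style PromQL, `rate()` needs at least two samples inside the range window to produce a value. The sketch below estimates how many samples a window holds; the scrape interval is an assumption you must supply, since this document does not state one, and `samples_in_window` is a hypothetical helper name.

```bash
# Hypothetical helper: approximate number of samples inside a range window.
# Usage: samples_in_window <window_seconds> <scrape_interval_seconds>
samples_in_window() {
  local window="$1" interval="$2"
  if [ "$interval" -le 0 ]; then
    echo "interval must be positive" >&2
    return 1
  fi
  # Integer division: whole scrapes that fit in the window
  echo $(( window / interval ))
}
```

Assuming a 30-second scrape interval, `samples_in_window 60 30` gives 2, so a single missed scrape leaves `rate(...[1m])` with one sample and no result, while `samples_in_window 300 30` gives 10, leaving plenty of margin.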
Failure Handling

| Error | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing API key | Run `oodle configure` |
| 404 Not Found | Metric name does not exist | Run `oodle metrics list --match <prefix>` |
| connection refused | Wrong | Check |
| Empty result from a query | Wrong label name or wrong label value | Re-run step 2 (`oodle metrics labels <name>`) and step 3 (`oodle metrics label-values <name> <label>`) |
| Query timeout | Cardinality too high (e.g. grouping by `request_id`) | Drop high-cardinality labels from the `by (...)` clause |
| | Broken PromQL syntax | Validate the expression in the UI metrics explorer first |
| 429 Too Many Requests | Heavy concurrent label-values calls | Add |