oodle-metrics

Oodle Metrics — Discovery and Querying

This skill teaches the agent to find metric names, enumerate their labels, and build PromQL queries that return data on the first try.

Prerequisites

    brew install oodle-ai/oodle/oodle
    oodle configure

Confirm the metrics endpoint works:

    oodle metrics list --limit 5 -o json | jq 'length'

Command Execution Order

Follow this order whenever a task needs a metric query:
  1. Check whether the required metric name is already in context.
  2. If not, run `oodle metrics list --match <prefix>` to discover it.
  3. Run `oodle metrics labels <name>` to enumerate labels.
  4. Run `oodle metrics label-values <name> <label>` to enumerate the values you need to filter on.
  5. Build the PromQL query using only labels confirmed in step 3 and values confirmed in step 4.
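
The ordering above can be sketched as one shell function. This is only an illustration: `discover` is a hypothetical wrapper name, and the only parts taken from this guide are the three `oodle metrics` subcommands it calls.

```bash
# discover is a hypothetical wrapper, not part of the oodle CLI.
# Usage: discover <prefix> [<name> [<label>]]
discover() {
  local prefix="$1" name="${2:-}" label="${3:-}"
  if [ -z "$name" ]; then
    # Step 2: the metric name is not known yet, so list candidates and stop.
    oodle metrics list --match "$prefix" -o json
    return
  fi
  # Step 3: enumerate the labels of the chosen metric.
  oodle metrics labels "$name" || return 1
  # Step 4: enumerate values only once a label has been picked.
  if [ -n "$label" ]; then
    oodle metrics label-values "$name" "$label"
  fi
}
```

Calling `discover http_requests` lists candidate names; calling `discover http_requests http_requests_total service` skips the listing and prints the labels, then the values of `service`. Step 5, writing the query, is deliberately left to the caller.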

Quick Reference

| Task | Command |
| --- | --- |
| List metric names (filtered) | `oodle metrics list --match "<prefix>" -o json` |
| List metric names (paged) | `oodle metrics list --limit 100 -o json` |
| Get one metric's metadata | `oodle metrics get <name> -o json` |
| List labels for a metric | `oodle metrics labels <name>` |
| List values for a label | `oodle metrics label-values <name> <label>` |

Common Operations

The 3-step label discovery workflow

Always run these three commands in order before writing a PromQL expression for a metric you have not used before.

    # Step 1: find the metric name
    oodle metrics list --match "http_requests" -o json
    # returns e.g. ["http_requests_total", "http_requests_in_flight"]

    # Step 2: list available labels
    oodle metrics labels http_requests_total
    # returns e.g. ["service", "method", "status", "env"]

    # Step 3: list values for a label
    oodle metrics label-values http_requests_total service
    # returns e.g. ["api", "checkout", "auth"]

    # ✅ CORRECT: query built from confirmed labels and values
    sum by (service) (rate(http_requests_total{service="api",env="prod",status=~"5.."}[5m]))

    # ❌ WRONG: guessing label names; query returns no data
    sum by (svc) (rate(http_requests{app="api",environment="production",http_status=~"5.."}[5m]))
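
One way to keep guessed names out of a query is to assemble the selector from the confirmed pairs instead of typing it by hand. The sketch below assumes a hypothetical helper named `build_selector`; it is not an oodle command, and it only handles exact-match (`=`) matchers, so regex matchers such as `status=~"5.."` still have to be written manually.

```bash
# build_selector is a hypothetical helper, not an oodle command.
# Usage: build_selector <metric> <label=value>...
# Pass only labels confirmed by `oodle metrics labels` and values
# confirmed by `oodle metrics label-values`.
build_selector() {
  local metric="$1" matchers="" pair label value
  shift
  for pair in "$@"; do
    label="${pair%%=*}"   # text before the first '='
    value="${pair#*=}"    # text after the first '='
    # Append label="value", comma-separated after the first matcher.
    matchers="${matchers:+$matchers,}${label}=\"${value}\""
  done
  printf '%s{%s}\n' "$metric" "$matchers"
}
```

For example, `echo "sum by (service) (rate($(build_selector http_requests_total service=api env=prod)[5m]))"` prints a query whose selector contains only the pairs that were passed in.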

Filtering and paging metric lists

    # ✅ CORRECT: narrow with --match
    oodle metrics list --match "http_requests" -o json

    # ✅ CORRECT: page with --limit when sweeping a namespace
    oodle metrics list --match "kube_" --limit 200 -o json

    # ❌ WRONG: listing every metric in the system, then grepping
    oodle metrics list -o json | jq '.[] | select(. | contains("http"))'

Cardinality-aware querying

Before grouping by a label, confirm cardinality is bounded:

    # ✅ CORRECT: confirm service has <100 values before grouping by it
    oodle metrics label-values http_requests_total service | wc -l

    # ❌ WRONG: grouping by a high-cardinality label like request_id can make the query time out
    sum by (request_id) (rate(http_requests_total[5m]))
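
In a script, the same check can gate the query instead of relying on someone reading the count. The following is a minimal sketch: `cardinality_ok` is a hypothetical helper, the default limit of 100 mirrors the threshold used above, and it assumes the values arrive one per line (which is what piping to `wc -l` already presumes).

```bash
# cardinality_ok is a hypothetical helper, not an oodle command.
# It reads one label value per line on stdin, prints the count, and
# succeeds only when the count is below the limit (default 100).
cardinality_ok() {
  local limit="${1:-100}" count
  # Count non-empty lines; "n + 0" prints 0 when the input is empty.
  count=$(awk 'NF { n++ } END { print n + 0 }')
  echo "$count"
  [ "$count" -lt "$limit" ]
}
```

A possible use is `oodle metrics label-values http_requests_total service | cardinality_ok 100 >/dev/null && echo "safe to group by service"`.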

Best Practices

Always use `--match <prefix>` when listing metrics

The metrics namespace can be large; an unfiltered `oodle metrics list` is slow and noisy.

    # ✅ CORRECT
    oodle metrics list --match "http_requests" -o json

    # ❌ WRONG: returns thousands of results, may time out
    oodle metrics list -o json

Run the 3-step discovery workflow before writing PromQL

Guessed label names produce queries that return no data, which makes it look as if the metric is missing.

    # ✅ CORRECT: labels confirmed by step 2, values confirmed by step 3
    oodle metrics labels http_requests_total
    oodle metrics label-values http_requests_total service
    sum by (service) (rate(http_requests_total{service="api"}[5m]))

    # ❌ WRONG: writing the query first, then debugging "why is it empty?"
    sum by (service_name) (rate(http_request_count{service_name="api"}[5m]))

Always pipe to `jq` for scripting, never parse table output

Column ordering and widths in table output are not stable.

    # ✅ CORRECT
    oodle metrics list --match "http_" -o json | jq -r '.[]'

    # ❌ WRONG
    oodle metrics list --match "http_" | tail -n +2 | awk '{print $1}'

Use `rate(...[5m])`, not `rate(...[1m])`, for every counter rate

`[1m]` rates are noisy on low-volume series and don't smooth across scrape gaps: `rate()` needs at least two samples inside the window, and a single missed scrape can leave a `[1m]` window with fewer.

    # ✅ CORRECT
    sum by (service) (rate(http_requests_total[5m]))

    # ❌ WRONG: flapping graphs, false alerts when a single scrape is missed
    sum by (service) (rate(http_requests_total[1m]))
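
The arithmetic behind this rule can be checked quickly. The sketch below uses a hypothetical helper, `samples_in_window`, and the 30-second scrape interval in the example is an assumption; substitute whatever interval your targets actually use. The result is approximate, since the exact count depends on how scrapes align with the window.

```bash
# samples_in_window is a hypothetical helper for back-of-the-envelope checks.
# Usage: samples_in_window <window_seconds> <scrape_interval_seconds>
samples_in_window() {
  local window="$1" interval="$2"
  # Integer division: roughly how many scrapes land inside one range window.
  echo $(( window / interval ))
}
```

With a 30-second interval, `samples_in_window 60 30` gives 2 and `samples_in_window 300 30` gives 10. Because `rate()` needs at least two samples, one missed scrape can empty a `[1m]` window, while a `[5m]` window still has plenty of samples left.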

Failure Handling

| Error | Cause | Fix |
| --- | --- | --- |
| 401 Unauthorized | Invalid or missing API key | Run `oodle configure` or set `OODLE_API_KEY` |
| 404 Not Found | Metric name does not exist | Run `oodle metrics list --match <prefix>` to find the correct name |
| connection refused | Wrong `OODLE_DEPLOYMENT` URL | Check `OODLE_DEPLOYMENT` env var |
| Empty result from a query | Wrong label name or wrong label value | Re-run step 2 (`labels`) and step 3 (`label-values`); fix the selector |
| Query timeout | Cardinality too high (e.g. `by (request_id)`) | Drop high-cardinality labels from `by (...)`; add a tighter time window |
| `parse error` | Broken PromQL syntax | Validate the expression in the UI metrics explorer first |
| 429 Too Many Requests | Heavy concurrent label-values calls | Add `--retries 3`; cache label-values output in scripts |
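
For the 429 case, caching label-values output could look roughly like the sketch below. Everything here except the `oodle metrics label-values` call is an assumption: `cached_label_values` and `CACHE_DIR` are hypothetical names, and the cache never expires on its own, so delete the directory when fresh values are needed.

```bash
# cached_label_values is a hypothetical wrapper, not part of the oodle CLI.
# CACHE_DIR is an assumed location; override it before calling if needed.
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/oodle-label-cache}"

# Usage: cached_label_values <name> <label>
cached_label_values() {
  local name="$1" label="$2" file
  mkdir -p "$CACHE_DIR" || return 1
  file="$CACHE_DIR/${name}.${label}"
  if [ ! -s "$file" ]; then
    # Cache miss: call the CLI once; drop the file again if the call fails.
    oodle metrics label-values "$name" "$label" > "$file" || { rm -f "$file"; return 1; }
  fi
  cat "$file"
}
```

Repeated calls such as `cached_label_values http_requests_total service` then hit the API only the first time, which keeps loops over many labels from issuing bursts of identical requests.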
