oodle-drop-rules

Oodle Drop Rules — Cost Control

This skill teaches the agent to drop or sample high-volume metrics safely: estimate the impact first, prefer sampling over dropping, and never silently delete a metric a dashboard depends on.

Prerequisites

```bash
brew install oodle-ai/oodle/oodle
oodle configure
```

Confirm the drop-rules endpoint works:

```bash
oodle drop-rules list -o json | jq 'length'
```

Command Execution Order

Before running any oodle command:
  1. Check whether the target metric prefix and matchers are already in context.
  2. If not, run `oodle metrics list --match <prefix> -o json | jq 'length'` to estimate impact.
  3. Confirm no critical dashboards or monitors depend on the affected series:
    • `oodle dashboards list -o json | jq '.[] | select(.panels[]?.query | contains("<metric>"))'`
    • `oodle monitors list -o json | jq '.[] | select(.query | contains("<metric>"))'`
  4. Prefer `action: sample` (e.g. `sampleRate: 0.1`) on the first attempt; switch to `action: drop` only after the sampled data confirms low value.
  5. Run `oodle drop-rules create -f rule.json`.

Quick Reference

| Task | Command |
|---|---|
| List rules | `oodle drop-rules list -o json` |
| Get rule | `oodle drop-rules get <id> -o json` |
| Create rule | `oodle drop-rules create -f rule.json` |
| Update rule | `oodle drop-rules update <id> -f rule.json` |
| Delete rule | `oodle drop-rules delete <id> --force` |

Common Operations

Rule schema

```json
{
  "name": "drop-noisy-debug-metrics",
  "matchers": [
    {"name": "level", "value": "debug"},
    {"name": "env",   "value": "staging"}
  ],
  "action": "drop",
  "sampleRate": null
}
```

| Field | Meaning |
|---|---|
| `matchers` | All matchers must match for the rule to fire (logical AND) |
| `action` | `drop` (discard) or `sample` (keep a fraction) |
| `sampleRate` | Required when `action="sample"`; float in (0,1]. `null` for `drop` |
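
The field rules in this table can be checked locally before a rule file is handed to the CLI. The following is a minimal Python sketch, not part of the oodle tooling: `validate_rule` is a hypothetical helper, and the non-empty `matchers` check is an added assumption rather than documented server behavior.

```python
from numbers import Real

VALID_ACTIONS = {"drop", "sample"}


def validate_rule(rule: dict) -> list[str]:
    """Return a list of problems found in a drop-rule document (empty if none)."""
    problems = []

    # Assumption: a rule with no matchers is treated as a mistake locally.
    matchers = rule.get("matchers")
    if not isinstance(matchers, list) or not matchers:
        problems.append("matchers must be a non-empty list")
    else:
        for i, m in enumerate(matchers):
            if not isinstance(m, dict) or not m.get("name") or "value" not in m:
                problems.append(f"matcher {i} needs 'name' and 'value'")

    action = rule.get("action")
    rate = rule.get("sampleRate")
    if action not in VALID_ACTIONS:
        problems.append("action must be 'drop' or 'sample'")
    elif action == "sample":
        # sampleRate is required for sample rules and must lie in (0, 1].
        is_number = isinstance(rate, Real) and not isinstance(rate, bool)
        if not is_number or not (0 < rate <= 1):
            problems.append("sampleRate must be a number in (0, 1] for 'sample'")
    elif rate is not None:
        # The schema table gives null as the sampleRate for drop rules.
        problems.append("sampleRate should be null for 'drop'")

    return problems
```

A `sample` rule saved without a `sampleRate` yields one problem string here, which corresponds to the `sampleRate required` error listed under Failure Handling.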

Estimating impact before creating a rule

```bash
# ✅ CORRECT: count series that will be affected
oodle metrics list --match "debug_" -o json | jq 'length'

# ✅ CORRECT: confirm no dashboard panel queries the metric
oodle dashboards list -o json | jq '.[] | select(.panels[]?.query | contains("debug_traffic_total"))'

# ✅ CORRECT: confirm no monitor depends on it
oodle monitors list -o json | jq '.[] | select(.query | contains("debug_traffic_total"))'

# ❌ WRONG: create the rule first, find out from on-call later
oodle drop-rules create -f rule.json
```
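
When the dashboard and monitor listings have already been saved as JSON, the same dependency check can be run offline. This Python sketch assumes the shapes implied by the jq filters in this section (dashboards with `panels[].query`, monitors with `query`); the `name` field and the `find_dependents` helper are hypothetical.

```python
def find_dependents(dashboards: list, monitors: list, metric: str) -> dict:
    """Return names of dashboards and monitors whose queries mention `metric`."""

    def mentions(query) -> bool:
        # Treat a missing or null query as "no match" instead of raising.
        return isinstance(query, str) and metric in query

    hit_dashboards = [
        d.get("name", "<unnamed>")
        for d in dashboards
        if any(mentions(p.get("query")) for p in d.get("panels") or [])
    ]
    hit_monitors = [
        m.get("name", "<unnamed>") for m in monitors if mentions(m.get("query"))
    ]
    return {"dashboards": hit_dashboards, "monitors": hit_monitors}
```

Like `contains` in jq, this is a substring match, so `debug_traffic_total` also flags a query on `debug_traffic_total_bytes`. That errs on the side of caution: a false positive costs a second look, while a missed dependency costs a broken panel.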

Creating a `sample` rule (safer first step)

```json
{
  "name": "sample-debug-metrics-staging",
  "matchers": [
    {"name": "__name__", "value": "debug_traffic_total"},
    {"name": "env",      "value": "staging"}
  ],
  "action": "sample",
  "sampleRate": 0.1
}
```

```bash
# ✅ CORRECT: keep 10% of the series; observe for a week before dropping
oodle drop-rules create -f rule.json
```

Promoting a `sample` rule to `drop`

```bash
# ✅ CORRECT: get → switch action → update
oodle drop-rules get dr_123 -o json > rule.json
jq '.action = "drop" | .sampleRate = null' rule.json > rule.new.json
oodle drop-rules update dr_123 -f rule.new.json

# ❌ WRONG: partial payload nulls matchers
oodle drop-rules update dr_123 -f <(echo '{"action":"drop"}')
```
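
For scripts that already hold the rule as parsed JSON, the jq step can be mirrored in Python. This is a rough sketch under the assumption that the document returned by `get` is accepted unchanged by `update`; `promote_to_drop` is a hypothetical helper, not an oodle API.

```python
import copy


def promote_to_drop(rule: dict) -> dict:
    """Return a full copy of `rule` switched to action=drop; the input is not modified."""
    if rule.get("action") != "sample":
        raise ValueError("expected a rule with action 'sample'")
    promoted = copy.deepcopy(rule)   # keeps name, matchers, and any other fields
    promoted["action"] = "drop"
    promoted["sampleRate"] = None    # json.dumps writes this as null
    return promoted
```

Because the whole document is copied before the two fields change, the payload passed to `update` still carries the original matchers, which is the point of the get-then-update pattern.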

Deleting (re-enabling ingestion)

```bash
# ✅ CORRECT: delete by explicit ID, and only if get confirms the rule exists
oodle drop-rules get dr_123 -o json > /dev/null &&
  oodle drop-rules delete dr_123 --force

# ❌ WRONG: speculative delete by name match
oodle drop-rules delete "$(oodle drop-rules list | grep debug | awk '{print $1}')" --force
```

Best Practices

Estimate impact with `oodle metrics list --match <prefix> -o json | jq 'length'` before creating a rule

A drop rule that matches more series than expected can hide real signal.

```bash
# ✅ CORRECT
oodle metrics list --match "kube_pod_" -o json | jq 'length'
# 1742 series: confirm with the team that all 1742 are safe to drop before creating a rule

# ❌ WRONG: create rule based on a guess; later discover a critical metric was matched
oodle drop-rules create -f rule.json
```

Prefer `action: sample` with `sampleRate: 0.1` on the first iteration

Sampling preserves enough signal to confirm the metric truly is low-value before fully dropping it.

```bash
# ✅ CORRECT, week 1: sample at 10%, observe dashboards
"action": "sample", "sampleRate": 0.1
# week 2: if no dashboards or monitors regressed, switch to drop
"action": "drop", "sampleRate": null

# ❌ WRONG: drop on first attempt; can break a dashboard nobody remembered
"action": "drop"
```

Always include both `env` and `service` (or `__name__`) in matchers

Broad matchers like `{level: debug}` alone can match production metrics that happen to share a label.

```bash
# ✅ CORRECT: scoped to one env
"matchers": [{"name":"level","value":"debug"},{"name":"env","value":"staging"}]

# ❌ WRONG: also drops debug metrics in prod
"matchers": [{"name":"level","value":"debug"}]
```

Always `get` before `update` to preserve fields

Update is a full-document replace.

```bash
# ✅ CORRECT
oodle drop-rules get dr_123 -o json > rule.json
jq '.sampleRate = 0.05' rule.json > rule.new.json
oodle drop-rules update dr_123 -f rule.new.json

# ❌ WRONG: drops matchers and action
oodle drop-rules update dr_123 -f <(echo '{"sampleRate":0.05}')
```

Name rules with `<action>-<metric-or-domain>-<scope>`

Predictable names make it easy to find and revert a rule when a dashboard breaks.

```bash
# ✅ CORRECT
"name": "drop-debug-metrics-staging"
"name": "sample-kube-pod-info-prod"

# ❌ WRONG
"name": "rule1"
```
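
A naming pattern like this can be linted before a rule is created. The sketch below is one possible reading of the convention, assuming lowercase letters and digits joined by hyphens with at least one segment each for the metric or domain and for the scope; both `NAME_PATTERN` and `follows_naming_convention` are hypothetical names.

```python
import re

# <action>-<metric-or-domain>-<scope>: action is drop or sample, followed by
# at least two more hyphen-separated lowercase segments (assumed character set).
NAME_PATTERN = re.compile(r"^(drop|sample)-[a-z0-9]+(?:-[a-z0-9]+)*-[a-z0-9]+$")


def follows_naming_convention(name: str) -> bool:
    """Return True when `name` has the <action>-<metric-or-domain>-<scope> shape."""
    return NAME_PATTERN.match(name) is not None
```

The check is purely structural: it cannot tell whether the last segment really names a scope such as `staging` or `prod`, so it complements review rather than replacing it.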

Failure Handling

| Error | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing API key | Run `oodle configure` or set `OODLE_API_KEY` |
| 404 Not Found | Drop rule ID does not exist | Verify with `oodle drop-rules list -o json` |
| connection refused | Wrong `OODLE_DEPLOYMENT` URL | Check `OODLE_DEPLOYMENT` env var |
| `sampleRate required` | `action: sample` without `sampleRate` | Add `sampleRate` between 0 and 1 (e.g. 0.1) |
| Dashboard panel suddenly empty | Drop rule matched a metric the panel queries | Run `oodle drop-rules list -o json` to find the rule; delete it (`oodle drop-rules delete <id> --force`) or narrow its matchers |
| Monitor went into `no data` | Drop rule matched the monitor's metric | Same fix as above; alternatively switch the rule from `drop` to `sample` |
| Cost did not decrease | Matchers don't actually match the high-volume series | Re-run `oodle metrics list --match ...` and compare label sets to the rule's matchers |
| 429 Too Many Requests | Bulk drop-rule sync | Add `--retries 3`, throttle to <10 creates per second |
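
For the 429 case, client-side pacing during a bulk sync might look like the Python sketch below. It is illustrative only: `run_throttled` is a hypothetical helper, each task is any zero-argument callable (for instance one that shells out to `oodle drop-rules create -f <file>`), and the default of 9 calls per second is simply a value under the limit of 10 mentioned in the table.

```python
import time


def run_throttled(tasks, max_per_second=9.0, retries=3,
                  sleep=time.sleep, clock=time.monotonic):
    """Run zero-argument callables in order, spacing calls and retrying failures.

    Every attempt (including retries) starts at least 1/max_per_second seconds
    after the previous one. A task that still fails after `retries` extra
    attempts re-raises its last exception.
    """
    min_interval = 1.0 / max_per_second
    results = []
    last_start = None
    for task in tasks:
        for attempt in range(retries + 1):
            if last_start is not None:
                wait = min_interval - (clock() - last_start)
                if wait > 0:
                    sleep(wait)
            last_start = clock()
            try:
                results.append(task())
                break
            except Exception:
                if attempt == retries:
                    raise
    return results
```

The `sleep` and `clock` parameters exist so the pacing can be exercised in tests without real delays; in normal use the defaults are enough.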
