auditing-experiments-flags

Auditing experiments and feature flags

This skill teaches you how to run configuration audits on experiments and feature flags. All checks use `read_data` and `list_data`; no SQL queries are needed for Phase 1 checks.

Usage modes

Quick check (single entity)

When the user asks about a specific experiment or flag:

Fetch the entity via `read_data` (e.g., `read_data("experiments", id)` or `read_data("feature_flags", id)`).
Apply the relevant checks from experiment checks or flag checks.
Report findings inline as markdown, grouped by severity (CRITICAL first, then WARNING, then INFO).
Include entity links as `[Experiment: name](/experiments/id)` or `[Flag: key](/feature_flags/id)`.
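
The ordering and link rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the `Finding` record shape and the helper names `entity_link` and `sort_by_severity` are assumptions introduced for the example.

```python
from dataclasses import dataclass

# Reporting order from the steps above: CRITICAL first, then WARNING, then INFO.
SEVERITY_ORDER = {"CRITICAL": 0, "WARNING": 1, "INFO": 2}


@dataclass
class Finding:
    # Hypothetical record shape; the skill does not prescribe one.
    severity: str
    check: str
    message: str


def entity_link(kind: str, label: str, entity_id: str) -> str:
    """Build a markdown link in the format the skill asks for."""
    if kind == "experiments":
        return f"[Experiment: {label}](/experiments/{entity_id})"
    return f"[Flag: {label}](/feature_flags/{entity_id})"


def sort_by_severity(findings: list[Finding]) -> list[Finding]:
    """Stable sort: findings of equal severity keep their original order."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
```

Because `sorted` is stable, findings produced by the same check stay together within each severity level.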

Scoped audit (one domain)

When the user asks to audit all experiments or all flags:

Bulk-fetch via `list_data` (e.g., `list_data("experiments")` or `list_data("feature_flags")`).
Run all checks for that domain against each entity.
Group findings by severity, then by entity.
Report as inline markdown.
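
One possible way to do the "group by severity, then by entity" step is shown below. It is a sketch under assumptions: findings are represented as `(severity, entity, message)` tuples, and `group_findings` is a hypothetical helper name, neither of which comes from the skill.

```python
from collections import defaultdict

SEVERITY_ORDER = ["CRITICAL", "WARNING", "INFO"]


def group_findings(findings):
    """Group (severity, entity, message) tuples by severity, then by entity."""
    grouped = {severity: defaultdict(list) for severity in SEVERITY_ORDER}
    for severity, entity, message in findings:
        grouped[severity][entity].append(message)
    # Drop severities with no findings and return plain dicts for easy printing.
    return {sev: dict(by_entity) for sev, by_entity in grouped.items() if by_entity}
```

The outer dictionary is built in severity order, so iterating over the result yields CRITICAL entries before WARNING and INFO.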

Full audit (comprehensive)

When the user asks for a comprehensive audit of both experiments and flags:

Fetch all experiments via `list_data("experiments")` and all flags via `list_data("feature_flags")`.
Run all experiment checks and all flag checks.
Apply recurring patterns to identify patterns across multiple findings.
If there are more than 5 entities with findings, output as a notebook artifact via `create_notebook` for easier navigation. Otherwise report inline.
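
The notebook-versus-inline rule is a simple threshold on the number of entities that have at least one finding. A minimal sketch, assuming findings are held in a dictionary keyed by entity (the function name `choose_output_mode` and that data shape are illustrative, not part of the skill):

```python
NOTEBOOK_THRESHOLD = 5  # from the rule above: more than 5 entities with findings


def choose_output_mode(findings_by_entity: dict[str, list[str]]) -> str:
    """Return "notebook" or "inline" based on how many entities have findings."""
    entities_with_findings = sum(1 for found in findings_by_entity.values() if found)
    return "notebook" if entities_with_findings > NOTEBOOK_THRESHOLD else "inline"
```

Entities with an empty findings list are not counted, so a large audit that turns up few problems still reports inline.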

Output format

For each finding, include:

Severity badge: `🔴 CRITICAL`, `🟡 WARNING`, or `🔵 INFO`
Check name: Which check produced this finding
Entity link: Markdown link to the entity
What's wrong: One-sentence description
Action: What to do about it (see remediation actions)

Example:

🟡 WARNING — Flag integration · Experiment: checkout-redesign The linked feature flag is inactive (paused). Traffic is not being split. Action: Re-enable the flag or end the experiment.
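
Before rendering a finding, it can help to confirm that all five parts are present. The sketch below assumes findings are plain dictionaries with the field names shown; those names, and the helpers `badge` and `missing_fields`, are hypothetical and chosen only for illustration.

```python
# Badge text as listed in the output format above.
BADGES = {"CRITICAL": "🔴 CRITICAL", "WARNING": "🟡 WARNING", "INFO": "🔵 INFO"}

# Hypothetical field names for the five required parts of a finding.
REQUIRED_FIELDS = ("severity", "check", "entity_link", "problem", "action")


def badge(severity: str) -> str:
    """Map a severity level to its badge; raises KeyError for unknown levels."""
    return BADGES[severity]


def missing_fields(finding: dict) -> list[str]:
    """List required fields that are absent or empty in a finding."""
    return [name for name in REQUIRED_FIELDS if not finding.get(name)]
```

A finding with a non-empty `missing_fields` result is incomplete and should be filled in before it is reported.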

Handling unavailable data

Some checks require activity logs, which may not be available via `read_data`. If activity log data is unavailable:

Skip `checkActivityHistory` (experiment check) entirely.
Skip the "toggle instability" and "never activated" sub-checks in flag lifecycle checks.
In your report, note which checks were skipped and why:

Skipped: Activity history checks (activity logs not available via current tools)
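
The skip logic can be expressed as a filter over the planned checks. This is a sketch only: it treats each check and sub-check as a plain string label, and `plan_checks` and `skipped_note` are hypothetical helpers, not functions the skill defines.

```python
# Checks that depend on activity logs, per the section above.
ACTIVITY_LOG_CHECKS = {"checkActivityHistory", "toggle instability", "never activated"}

SKIP_REASON = "activity logs not available via current tools"


def plan_checks(all_checks, activity_logs_available):
    """Split checks into (to_run, skipped) depending on activity log availability."""
    if activity_logs_available:
        return list(all_checks), []
    to_run = [c for c in all_checks if c not in ACTIVITY_LOG_CHECKS]
    skipped = [c for c in all_checks if c in ACTIVITY_LOG_CHECKS]
    return to_run, skipped


def skipped_note(skipped):
    """Build the report line naming skipped checks, or an empty string if none."""
    if not skipped:
        return ""
    return f"Skipped: {', '.join(skipped)} ({SKIP_REASON})"
```

Keeping the skipped list alongside the run list makes it hard to forget the note in the final report.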

Partial failures

If a `read_data` or `list_data` call fails for some entities:

Continue with the entities you could fetch.
Report which entities could not be assessed and why.
Do not silently omit entities from the audit.
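
The "continue, but report what failed" behaviour might look like the following. The `fetch` argument is a hypothetical callable standing in for a `read_data` call, since the tool itself is not a Python function; `fetch_all` is likewise an illustrative name.

```python
def fetch_all(entity_ids, fetch):
    """Fetch each entity, recording failures instead of dropping them.

    `fetch` is a hypothetical stand-in for a read_data call on one entity.
    Returns (fetched, unassessed), where unassessed maps id to failure reason.
    """
    fetched, unassessed = {}, {}
    for entity_id in entity_ids:
        try:
            fetched[entity_id] = fetch(entity_id)
        except Exception as exc:
            # Keep the reason so the report can say why the entity was not assessed.
            unassessed[entity_id] = str(exc)
    return fetched, unassessed
```

The audit then runs over `fetched`, and every entry in `unassessed` is listed in the report with its reason.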

Reference files

Experiment checks — experiment configuration checks
Flag checks — feature flag checks
Finding types — severity and category definitions
Recurring patterns — patterns across multiple findings
Remediation actions — what to do about each finding
