auditing-experiments-flags

Original：🇺🇸 English

Translated

Audit PostHog experiments and feature flags for configuration issues, staleness, and best-practice violations. Read when the user asks to audit, health-check, or review experiments or feature flags, check flag hygiene, or verify experiment setup.

8installs

Sourceposthog/ai-plugin

Added on2026-05-18

NPX Install

npx skill4agent add posthog/ai-plugin auditing-experiments-flags

SKILL.md Content

View Translation Comparison →

Auditing experiments and feature flags

This skill teaches you how to run configuration audits on experiments and feature flags. All checks use

read_data

and

list_data

— no SQL queries are needed for Phase 1 checks.

Usage modes

Quick check (single entity)

When the user asks about a specific experiment or flag:

Fetch the entity via

read_data

(e.g.,

read_data("experiments", id)

read_data("feature_flags", id)

Apply the relevant checks from experiment checks or flag checks.
Report findings inline as markdown, grouped by severity (CRITICAL first, then WARNING, then INFO).

Include entity links as

[Experiment: name](/experiments/id)

[Flag: key](/feature_flags/id)

Scoped audit (one domain)

When the user asks to audit all experiments or all flags:

Bulk-fetch via

list_data

(e.g.,

list_data("experiments")

list_data("feature_flags")

Run all checks for that domain against each entity.
Group findings by severity, then by entity.
Report as inline markdown.

Full audit (comprehensive)

When the user asks for a comprehensive audit of both experiments and flags:

Fetch all experiments via

list_data("experiments")

and all flags via

list_data("feature_flags")

Run all experiment checks and all flag checks.
Apply recurring patterns to identify patterns across multiple findings.
If there are more than 5 entities with findings, output as a notebook artifact via
```
create_notebook
```
for easier navigation. Otherwise report inline.

Output format

For each finding, include:

Severity badge:
```
🔴 CRITICAL
```
,
```
🟡 WARNING
```
, or
```
🔵 INFO
```
Check name: Which check produced this finding
Entity link: Markdown link to the entity
What's wrong: One-sentence description
Action: What to do about it (see remediation actions)

Example:

🟡 WARNING — Flag integration · Experiment: checkout-redesign The linked feature flag is inactive (paused). Traffic is not being split. Action: Re-enable the flag or end the experiment.

Handling unavailable data

Some checks require activity logs, which may not be available via

read_data

. If activity log data is unavailable:

Skip
```
checkActivityHistory
```
(experiment check) entirely.
Skip the "toggle instability" and "never activated" sub-checks in flag lifecycle checks.
In your report, note which checks were skipped and why:

Skipped: Activity history checks (activity logs not available via current tools)

Partial failures

If a

read_data

list_data

call fails for some entities:

Continue with the entities you could fetch.
Report which entities could not be assessed and why.
Do not silently omit entities from the audit.

Reference files

Experiment checks — experiment configuration checks
Flag checks — feature flag checks
Finding types — severity and category definitions
Recurring patterns — patterns across multiple findings
Remediation actions — what to do about each finding