Found 7 Skills
Use when experiments complete to judge what claims the results support, what they don't, and what evidence is still missing. Codex MCP evaluates results against intended claims and routes to next action (pivot, supplement, or confirm). Use after experiments finish — before writing the paper or running ablations.
Resolves experiment references from natural language to concrete experiment IDs. Handles name lookups, fuzzy descriptions ('the signup experiment', 'my latest experiment'), status filtering, and disambiguation when multiple experiments match. TRIGGER when: user refers to an experiment by name, description, or relative reference ('latest', 'most recent', 'the one I created yesterday') and you don't already have the experiment ID. DO NOT TRIGGER when: user provides an experiment ID directly, or you already resolved the experiment earlier in the conversation.
Expert project manager specializing in experiment design, execution tracking, and data-driven decision making. Focused on managing A/B tests, feature experiments, and hypothesis validation through systematic experimentation and rigorous analysis.
Decide where files live in an ML experimentation project: reusable code in `src/<pkg>/`, one `# %%` script per experiment in `experiments/`, design notes + index in `journal/`, reports in `reports/`, agent-only probes in `scratch/`, narrative digest in `overview/summary.md`. Owns the layout, the file-creation rules (one file per experiment, ask before editing), and the jupytext `# %%` script convention. Never imposes `data/`; the user owns that.
TRIGGER (any of):
- Starting a new ML project / scaffolding a workspace.
- About to create the first experiment file in a project.
- About to create `src/<pkg>/data.py` / `features.py` / `pipeline.py` / `evaluate.py` for the first time.
- About to write a `.ipynb` for experimentation: redirect to a `# %%` script under `experiments/`.
- User asks where something should live, how to organize the project, or how to set up the workspace.
- About to add a new experiment iteration: decide new file vs edit existing (ask the user).
SKIP when: the file is clearly part of an already-populated module (e.g., adding a function to existing `features.py`); pure refactor inside a single existing file; pipeline declaration mechanics (`build-ml-pipeline`); evaluation mechanics (`evaluate-ml-pipeline`); skore symbol lookup (`python-api`).
HOW TO USE: **first run the Detection table** below. If any signal matches, glue to existing conventions (do not rename or move folders). If no signal matches, scaffold the default layout. **Emit the Pre-flight checklist as visible text and read the Stop conditions before any file is created or edited.** Use templates in `templates/`; copy and adapt, do not rewrite from scratch.
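As a rough illustration of the default layout this description names, the following is a minimal sketch, not the skill's own implementation. The function names `scaffold` and `new_experiment`, the package name passed in, and the file contents written are all assumptions made for the example; only the folder names and the `# %%` cell marker come from the description.

```python
from pathlib import Path

# Folder names come from the skill description; everything else here is a
# hypothetical sketch. Note that data/ is deliberately absent from the list.
LAYOUT = ("src/{pkg}", "experiments", "journal", "reports", "scratch", "overview")


def scaffold(root: Path, pkg: str) -> list[Path]:
    """Create the default folders under `root` and return the created paths."""
    created = []
    for template in LAYOUT:
        folder = root / template.format(pkg=pkg)
        folder.mkdir(parents=True, exist_ok=True)  # leaves existing folders untouched
        created.append(folder)
    summary = root / "overview" / "summary.md"
    if not summary.exists():
        summary.write_text("# Summary\n")  # placeholder content (assumption)
    return created


def new_experiment(root: Path, name: str) -> Path:
    """Create one `# %%` script per experiment; refuse to overwrite an existing one."""
    path = root / "experiments" / f"{name}.py"
    if path.exists():
        # Mirrors the "ask before editing" rule: the caller decides what to do next.
        raise FileExistsError(f"{path} already exists; ask before editing it")
    path.write_text(f"# %%\n# Experiment: {name}\n\n# %%\n")
    return path
```

Calling `scaffold(Path("my_project"), "mypkg")` followed by `new_experiment(Path("my_project"), "exp_001")` would produce the folders plus a single percent-format script that jupytext can open as a notebook. Raising on an existing file is one possible way to surface the "new file vs edit existing" decision to the user.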
Curated repository of experiment hypotheses, assumptions, and historical learnings.
10 research automation skills. Trigger: automating experiments, tracking results, reproducible pipelines. Design: ML experiment management, workflow orchestration, and lab automation tools.
Baidu FaMou algorithm skills for efficient algorithm self-evolution. Provides experiment management and visualization capabilities to help optimize complex algorithms. Use when the user needs algorithm optimization or experiment management.