Loading...
Loading...
Found 26 Skills
Set up and run an autonomous experiment loop for any optimization target. Use when asked to start autoresearch or run experiments.
Run Karpathy-style autoresearch optimization on any content. Generates 50+ variants, scores with a 5-expert simulated panel, evolves winners through multiple rounds, outputs optimized version + full experiment log. Use when optimizing landing pages, email sequences, ad copy, headlines, form pages, CTA text, or any conversion-focused content. Triggers on "optimize this page", "run autoresearch", "score these variants", "A/B test this copy".
Autonomous experiment loop that optimizes any file by a measurable metric. Inspired by Karpathy's autoresearch. The agent edits a target file, runs a fixed evaluation, keeps improvements (git commit), discards failures (git reset), and loops indefinitely. Use when: user wants to optimize code speed, reduce bundle/image size, improve test pass rate, optimize prompts, improve content quality (headlines, copy, CTR), or run any measurable improvement loop. Requires: a target file, an evaluation command that outputs a metric, and a git repo.
This skill should be used when the user asks to "run autoresearch", "optimize X in a loop", "set up autonomous experiments", "start autoresearch", "optimize X overnight", or "experiment loop". Sets up and runs an autonomous experiment loop for any optimization target.
Autonomously optimize any Claude Code skill by running it repeatedly, scoring outputs against binary evals, mutating the prompt, and keeping improvements. Based on Karpathy's autoresearch methodology. Use when: optimize this skill, improve this skill, run autoresearch on, make this skill better, self-improve skill, benchmark skill, eval my skill, run evals on. Outputs: an improved SKILL.md, a results log, and a changelog of every mutation tried.
Karpathy-inspired autonomous research loop. Agent edits one file, evals, keeps or discards, repeats. Plateau-triggered web search breaks through ceilings. Git as state machine. Runs until stopped or budget exhausted.
Fully autonomous research pipeline that turns a topic idea into a complete academic paper with real citations, experiments, and conference-ready LaTeX.
Set up a new autoresearch experiment interactively. Collects domain, target file, eval command, metric, direction, and evaluator.
Optimizes algorithms via autoresearch loop: benchmark, research, hypothesize, keep/discard
Run the hive experiment loop — autonomous iteration on a shared task. Use when the agent is in a hive task directory and needs to run experiments, submit results, or participate in the swarm. Triggers on "hive", "run hive", "autoresearch", "start experimenting", "join the swarm", "start the loop", or when .hive/task file is detected.
Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure each against hard gates and/or LLM-as-judge quality scores, keep improvements, and converge toward the best solution. Use when optimizing clustering quality, search relevance, build performance, prompt quality, or any measurable outcome that benefits from systematic experimentation. Inspired by Karpathy's autoresearch, generalized for multi-file code changes and non-ML domains.
Use when user wants autonomous iteration on any task — improving metrics, completing features, running experiments, optimizing code, or working unattended. Make sure to use this skill whenever someone mentions autoresearch, autonomous loops, iterating until done, running overnight, keep improving, hill-climbing, or any measurable improvement goal, even if they don't explicitly ask for a 'loop'.