Found 33 Skills
Optimizes algorithms via an autoresearch loop: benchmark, research, hypothesize, then keep or discard each change.
Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure each against hard gates and/or LLM-as-judge quality scores, keep improvements, and converge toward the best solution. Use when optimizing clustering quality, search relevance, build performance, prompt quality, or any measurable outcome that benefits from systematic experimentation. Inspired by Karpathy's autoresearch, generalized for multi-file code changes and non-ML domains.
Run the hive experiment loop — autonomous iteration on a shared task. Use when the agent is in a hive task directory and needs to run experiments, submit results, or participate in the swarm. Triggers on "hive", "run hive", "autoresearch", "start experimenting", "join the swarm", "start the loop", or when .hive/task file is detected.
Autonomous research agent that reads RESEARCH.md, infers what's needed, dynamically adjusts TODOs, and delegates to the right skill. Supports opt-in BFS mode for autonomous design space search. Respects a configurable supervision policy (presets: manual / checkpointed / autonomous / wild) governing notifications, approval gates, resource limits, and idea-change handling. Proactively surfaces gaps and asks before acting. Trigger phrases: "start research", "continue project", "what's next?", "explore design space", "autoresearch".
Runs ML experiments reproducibly — single runs or autonomous BFS batches. Single mode: isolated venv, time-budgeted, failure-handled, logs to RESEARCH.md. BFS mode (opt-in): designs N hypotheses, runs each for a fixed budget, compares via a single verifiable metric, keeps improvements and git-resets failures — fully autonomous until done. Respects the RESEARCH.md supervision policy for notifications, approvals, and stop limits. Trigger phrases: "run experiment", "train model", "explore design space", "find best config", "autoresearch".
Use when user wants autonomous iteration on any task — improving metrics, completing features, running experiments, optimizing code, or working unattended. Make sure to use this skill whenever someone mentions autoresearch, autonomous loops, iterating until done, running overnight, keep improving, hill-climbing, or any measurable improvement goal, even if they don't explicitly ask for a 'loop'.
Run AutoML / hyperparameter optimization (HPO) for NVIDIA TAO networks using AutoMLRunner. Handles algorithm selection (bayesian, hyperband, asha, bohb, llm, hybrid, autoresearch), WandB experiment tracking, job execution on any TAO SDK platform, result interpretation, and per-rec custom evaluation hooks. Use when the user mentions TAO AutoML, hyperparameter optimization, HPO, automl, automl_settings, AutoMLRunner, tao_automl, bayesian search, hyperband, ASHA, LLM-guided search, autoresearch, or wants to tune training hyperparameters for any TAO network. Platform-agnostic — runs on any SDK (Lepton, Brev, SLURM, Kubernetes, Docker).
Run a single experiment iteration. Edit the target file, evaluate, keep or discard.
Resume a paused experiment. Checkout the experiment branch, read results history, continue iterating.
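Several of the skills above describe the same core loop: propose a change, measure it against a single metric, keep it if the metric improves, and discard it otherwise. The following is a minimal sketch of that keep/discard loop, not the implementation of any listed skill. It assumes a higher-is-better metric, and the names `run_loop`, `propose_step`, and `evaluate_quadratic` are hypothetical placeholders for "edit the target file" and "run the benchmark".

```python
import random
from typing import Callable, List, Tuple

# One history entry: (candidate, score, kept)
Record = Tuple[float, float, bool]


def run_loop(
    initial: float,
    propose: Callable[[float, random.Random], float],
    evaluate: Callable[[float], float],
    iterations: int,
    seed: int = 0,
) -> Tuple[float, float, List[Record]]:
    """Hill-climb: keep a candidate only if it strictly beats the best score."""
    rng = random.Random(seed)
    best = initial
    best_score = evaluate(best)  # baseline measurement before any change
    history: List[Record] = []

    for _ in range(iterations):
        candidate = propose(best, rng)   # hypothetical "edit" step
        score = evaluate(candidate)      # hypothetical "benchmark" step
        kept = score > best_score
        if kept:
            # Keep: the candidate becomes the new baseline.
            best, best_score = candidate, score
        # Discard: nothing to undo here, because `best` was never mutated.
        history.append((candidate, score, kept))

    return best, best_score, history


def propose_step(x: float, rng: random.Random) -> float:
    """Toy proposal (assumption): nudge the current value by a random amount."""
    return x + rng.uniform(-1.0, 1.0)


def evaluate_quadratic(x: float) -> float:
    """Toy metric (assumption): higher is better, with the peak at x = 3."""
    return -(x - 3.0) ** 2


if __name__ == "__main__":
    best, best_score, history = run_loop(
        0.0, propose_step, evaluate_quadratic, iterations=200, seed=1
    )
    kept_count = sum(1 for _, _, kept in history if kept)
    print(f"best={best:.3f} score={best_score:.5f} kept={kept_count}/200")
```

In a real code-editing loop the discard branch would revert the working tree (the BFS description above mentions git-resetting failures) rather than simply dropping a value, and the results history would be persisted so that a paused experiment can be resumed.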