Build a structured taxonomy of failure modes from open-coded trace annotations. Use this skill whenever the user has freeform annotations from reviewing LLM traces and wants to cluster them into a coherent, non-overlapping set of binary failure categories (axial coding). Also use when the user mentions "failure modes", "error taxonomy", "axial coding", "cluster annotations", "categorize errors", "failure analysis", or wants to go from raw observation notes to structured evaluation criteria. This skill covers the full pipeline: grouping open codes, defining failure modes, re-labeling traces, and quantifying error rates.
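The tail of that pipeline is mechanical, so a minimal sketch may help; the failure modes, trace IDs, and labels below are illustrative assumptions, not part of the skill itself:

```python
from collections import Counter

# Axial-coding output: each failure mode is a binary category with a definition.
failure_modes = {
    "hallucinated_fact": "Asserts a fact unsupported by the provided context.",
    "ignored_instruction": "Violates an explicit user instruction.",
}

# Re-labeled traces: one set of binary failure labels per reviewed trace.
traces = [
    {"id": "t1", "labels": {"hallucinated_fact"}},
    {"id": "t2", "labels": set()},
    {"id": "t3", "labels": {"hallucinated_fact", "ignored_instruction"}},
]

# Quantify: per-mode error rate = fraction of traces exhibiting that mode.
counts = Counter(label for t in traces for label in t["labels"])
for mode in failure_modes:
    print(f"{mode}: {counts[mode] / len(traces):.0%} of {len(traces)} traces")
```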
Synthesize unstructured thinking into a structured, actionable plan. Use when user provides stream-of-consciousness thoughts, scattered notes, or a brain dump and needs them organized into a coherent plan with goals, actions, and priorities. Trigger phrases: "synthesize", "organize my thoughts", "turn this into a plan", "make sense of this", "structure this", "formalize these notes", "what should I do with all this".
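As a rough illustration of the target structure only, a synthesized plan might look like the sketch below; the field names (goals, actions, priority) are assumptions drawn from the description, not a fixed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    description: str
    priority: int  # 1 = highest

@dataclass
class Plan:
    goals: list[str]
    actions: list[Action] = field(default_factory=list)

    def ordered(self) -> list[Action]:
        # Surface the highest-priority actions first.
        return sorted(self.actions, key=lambda a: a.priority)

plan = Plan(
    goals=["Ship the v1 evaluation harness"],
    actions=[Action("Write failure-mode rubric", 1), Action("Pick judge model", 2)],
)
print([a.description for a in plan.ordered()])
```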
Write, review, and improve prompts for any LLM — Claude, GPT, Gemini, Llama, DeepSeek, Mistral, Cohere, Qwen, Grok, Nova, and more. Use when the user asks to "write a system prompt", "improve this prompt", "review my prompt", "make a prompt for", "optimize my prompt", "fix my prompt", "why isn't my prompt working", or wants help writing better prompts for any AI model. Also use when building agents, chatbots, or AI assistants that need system-level instructions, or when the user has a bad prompt they want rewritten. Covers system prompts, task prompts, tool descriptions, and general prompt improvement across all major model families.
Systematic pre-publication manuscript audit producing a structured refactoring report with section-level diagnostics, citation hygiene analysis, and submission-readiness assessment. Use this skill whenever the user uploads a manuscript, paper, thesis chapter, journal submission, or conference paper and asks for review, feedback, editing, refactoring, pre-submission check, proofreading, or quality audit. Also trigger when the user says "review my paper", "check before submission", "is this ready to submit", "pre-pub checklist", "manuscript review", "refactor my paper", or asks about citation consistency, argument coherence, or formatting compliance. Covers partial requests like "check my references" or "does the abstract work" — the full diagnostic surfaces issues across all facets even when only one was asked about.
Critical analysis of research papers, academic manuscripts, preprints, and technical studies — evaluating methodology, claims-evidence alignment, contribution significance, and intellectual honesty. Produces coherent analytical responses (not checklists) that distinguish genuine weaknesses from standard field limitations. Governs intellectual posture: collegial reader, not adversarial reviewer. Triggers on: "critique this paper", "review this research", "what do you think of this paper", "analyze this study", "evaluate the methodology", "is this paper sound", "assess this research", "strengths and weaknesses of this paper", "does the evidence support the claims". Use this skill when the user provides a research paper, preprint, or technical study and asks for critical evaluation of its scientific merit, methodology, or contribution — not formatting, citation hygiene, or submission readiness (use manuscript-review for those).
Orchestrates end-to-end interview preparation for senior ML/AI engineers targeting Anthropic and peer companies. Use for prep timeline generation, story coherence across rounds, mock scheduling, and debrief analysis. Activate on "interview prep", "interview loop", "Anthropic interview", "prep timeline". NOT for resume writing, career narratives, or individual round-type practice.
Multi-agent distributed context preservation protocol using cryptographic sharding, gossip propagation, and Byzantine fault tolerance to maintain coherent shared memory across dynamic agent networks.
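The protocol is underspecified here; as a loose illustration of the sharding-plus-integrity idea only, the sketch below splits context into hashed shards that a receiving agent verifies before merging. Gossip scheduling and Byzantine voting are omitted, and every name is hypothetical:

```python
import hashlib

def shard_context(context: str, n_shards: int) -> list[dict]:
    """Split context into shards, each carrying its own integrity hash."""
    size = -(-len(context) // n_shards)  # ceiling division
    shards = []
    for i in range(n_shards):
        body = context[i * size:(i + 1) * size]
        shards.append({"index": i, "body": body,
                       "digest": hashlib.sha256(body.encode()).hexdigest()})
    return shards

def verify_and_merge(shards: list[dict]) -> str:
    """Recombine shards, rejecting any whose body fails its hash check."""
    for s in shards:
        if hashlib.sha256(s["body"].encode()).hexdigest() != s["digest"]:
            raise ValueError(f"shard {s['index']} failed integrity check")
    return "".join(s["body"] for s in sorted(shards, key=lambda s: s["index"]))

shards = shard_context("shared agent memory state", 4)
assert verify_and_merge(shards) == "shared agent memory state"
```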
Perform a high-level flow audit of an implementation plan, analyzing phase-to-phase dependencies, data flow consistency, ordering logic, stale artifacts, and risk assessment. Use when asked to 'audit the plan', 'check plan flow', 'review plan dependencies', 'find plan discrepancies', or 'assess plan coherence'. Do NOT use for per-phase template compliance (use /review-plan) or creating plans (use /create-plan).
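A minimal sketch of the data-flow half of such an audit, assuming each phase declares the artifacts it produces and consumes; the phase and artifact names are invented for illustration:

```python
phases = [
    {"name": "design",    "produces": {"schema"},  "consumes": set()},
    {"name": "implement", "produces": {"service"}, "consumes": {"schema"}},
    {"name": "test",      "produces": {"report"},  "consumes": {"service", "fixtures"}},
]

# Ordering check: every consumed artifact must be produced by an earlier phase.
produced: set[str] = set()
for phase in phases:
    for artifact in phase["consumes"]:
        if artifact not in produced:
            print(f"flow issue: '{phase['name']}' consumes '{artifact}' "
                  "before any earlier phase produces it")
    produced |= phase["produces"]

# Stale-artifact check: produced but never consumed (final deliverables are
# expected hits and would need an allowlist in a real audit).
consumed = set().union(*(p["consumes"] for p in phases))
for artifact in produced - consumed:
    print(f"possibly stale: '{artifact}' is produced but never consumed")
```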
Generate complete academic survey papers using multi-LLM parallel outline generation, RAG-based subsection writing, citation validation, and local coherence enhancement. Based on the AutoSurvey pipeline. Use for writing comprehensive literature surveys.
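A skeletal sketch of that control flow, with generate() and retrieve() as hypothetical stand-ins for real LLM clients and the RAG store; this shows the shape of the pipeline, not the AutoSurvey code:

```python
from concurrent.futures import ThreadPoolExecutor

def generate(model: str, prompt: str) -> str:
    # Stand-in for a real LLM client call.
    return f"[{model} draft: {prompt[:40]}...]"

def retrieve(query: str) -> list[str]:
    # Stand-in for the RAG store lookup.
    return [f"paper relevant to '{query}'"]

def merge_outlines(outlines: list[str]) -> list[str]:
    # The real pipeline reconciles candidate outlines; stub returns fixed sections.
    return ["Introduction", "Methods", "Open Problems"]

def write_survey(topic: str, models: list[str]) -> str:
    # 1. Outline generation in parallel across several models, then merged.
    with ThreadPoolExecutor() as pool:
        outlines = list(pool.map(
            lambda m: generate(m, f"Outline a survey on {topic}"), models))
    sections = merge_outlines(outlines)

    # 2. RAG-based subsection writing: retrieve sources, then draft grounded text.
    drafts = []
    for section in sections:
        sources = retrieve(f"{topic} {section}")
        drafts.append(generate(models[0], f"Write '{section}' citing: {sources}"))

    # 3. Citation validation and local coherence passes would re-check drafts here.
    return "\n\n".join(drafts)

print(write_survey("LLM evaluation", ["model-a", "model-b"]))
```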
Map user missions from trigger to value moment, organizing features into coherent paths during PRD v0.4 User Journeys. Triggers on requests to map user journeys, define user flows, describe how users accomplish goals, or when the user asks "map user journeys", "define user flows", "user missions", "how do users accomplish X?", "journey mapping", "what steps do users take?", "pain to value flow". Consumes PER- (Persona Definition), FEA- (Feature Value Planning), KPI- (Outcome Definition). Outputs UJ- entries with step flows, pain points, and value moments. Feeds v0.4 Screen Flow Definition.
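One plausible shape for a UJ- entry, assuming the PER-/FEA-/KPI- inputs are referenced by ID; the field names are a guess from the description, not a spec:

```python
from dataclasses import dataclass

@dataclass
class JourneyEntry:
    uj_id: str              # e.g. "UJ-001"
    persona: str            # PER- reference
    features: list[str]     # FEA- references along the path
    kpi: str                # KPI- the journey moves
    steps: list[str]        # trigger -> ... -> value moment
    pain_points: list[str]

entry = JourneyEntry(
    uj_id="UJ-001",
    persona="PER-002",
    features=["FEA-004", "FEA-007"],
    kpi="KPI-001",
    steps=["receives alert", "opens dashboard", "exports report"],
    pain_points=["alert lacks context"],
)
print(f"{entry.uj_id}: {entry.steps[0]} -> {entry.steps[-1]}")
```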
Use when synthesizing multiple sources into coherent knowledge bases, performing multi-source analysis, or creating topic expertise from URLs and files. Also use when encountering content integration tasks requiring connections across disparate materials.
LLM-as-judge evaluation framework with 5-dimension rubric (accuracy, groundedness, coherence, completeness, helpfulness) for scoring AI-generated content quality with weighted composite scores and evidence citations.
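The composite itself is simple arithmetic; a minimal sketch, assuming 1-5 scores per dimension and weights that sum to 1 (both values illustrative):

```python
weights = {"accuracy": 0.3, "groundedness": 0.25, "coherence": 0.15,
           "completeness": 0.15, "helpfulness": 0.15}
scores = {"accuracy": 4, "groundedness": 5, "coherence": 4,
          "completeness": 3, "helpfulness": 4}

# Weights must form a convex combination so the composite stays on the 1-5 scale.
assert abs(sum(weights.values()) - 1.0) < 1e-9
composite = sum(weights[d] * scores[d] for d in weights)
print(f"composite: {composite:.2f} / 5")  # 4.10 for these values
```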