LLM guardrails with NeMo Guardrails, Guardrails AI, and OpenAI. Input/output rails, hallucination prevention, fact-checking, toxicity detection, red-teaming patterns. Use when building LLM guardrails, safety checks, or red-team workflows.
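The input/output "rails" pattern described above can be sketched in plain Python. This is a minimal illustration of the pattern only, not the NeMo Guardrails or Guardrails AI API; the function names, patterns, and denylist are hypothetical placeholders.

```python
import re

# Illustrative input rail: screen user messages before they reach the model.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|previous|prior) instructions", re.I),
    re.compile(r"reveal (your|the) system prompt", re.I),
]

# Illustrative output rail: toy denylist of terms to redact from replies.
BLOCKED_OUTPUT_TERMS = {"ssn", "credit card number"}

def input_rail(user_message: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a user message before it hits the model."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_message):
            return False, f"possible prompt injection: {pattern.pattern!r}"
    return True, "ok"

def output_rail(model_reply: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a model reply before it reaches the user."""
    lowered = model_reply.lower()
    for term in BLOCKED_OUTPUT_TERMS:
        if term in lowered:
            return False, f"blocked term in output: {term!r}"
    return True, "ok"

def guarded_generate(user_message: str, model) -> str:
    """Wrap any callable model with both rails."""
    ok, reason = input_rail(user_message)
    if not ok:
        return f"[refused: {reason}]"
    reply = model(user_message)
    ok, reason = output_rail(reply)
    if not ok:
        return f"[redacted: {reason}]"
    return reply
```

Real frameworks replace the regex and denylist checks with classifier models and configurable flows, but the control flow (check input, call model, check output) is the same.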
Adaptive exploration pipeline that integrates /brainstorm, /think, and /red-team with intelligent pivoting. Unlike /deepthink (which takes a fixed idea and iterates), /prospect starts with divergent brainstorming, picks the most promising vein, runs deep analysis, and — crucially — can PIVOT back to divergent thinking when: the idea dies under red-team, an adjacent opportunity surfaces during analysis, or the research reveals the real opportunity is elsewhere. Produces a prospecting report: the landscape explored, veins assayed, pivots taken, and the final stake with conviction. Use when the user says "prospect", "explore this space", "find opportunities", "what should I build", "explore and analyze", or has a domain/trend they want to both explore AND evaluate.
Find every way users can break your AI before they do. Use when you need to red-team your AI, test for jailbreaks, find prompt injection vulnerabilities, run adversarial testing, do a safety audit before launch, prove your AI is safe for compliance, stress-test guardrails, or verify your AI holds up against adversarial users. Covers automated attack generation, iterative red-teaming with DSPy, and MIPROv2-optimized adversarial testing.
Red-team security review for code changes. Use when reviewing pending git changes, branch diffs, or new features for security vulnerabilities, permission gaps, injection risks, and attack vectors. Acts as a pen-tester analyzing code.
Real-time monitoring and detection of adversarial attacks and model drift in production
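One common building block for the drift-detection side of this skill is the Population Stability Index (PSI) between a reference score distribution and a live one. A minimal, self-contained sketch (the bin count and smoothing constant are illustrative choices, not a prescribed configuration):

```python
import math

def psi(reference, live, bins=10):
    """Population Stability Index between two samples of scores in [0, 1).
    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 major drift."""
    def bin_fractions(sample):
        counts = [0] * bins
        for x in sample:
            counts[min(int(x * bins), bins - 1)] += 1
        n = len(sample)
        # Smooth empty bins so log(c / r) is always defined.
        return [max(c / n, 1e-6) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

In production this would run on a sliding window of model scores and alert when the index crosses a threshold; detecting adversarial attacks typically layers additional per-request classifiers on top.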
Techniques to test and bypass AI safety filters, content moderation systems, and guardrails for security assessment
Implementing safety filters, content moderation, and guardrails for AI system inputs and outputs
End-to-end deep research and analysis pipeline. Takes a raw idea or market question, conducts deep web research, builds a competitive landscape, runs multi-framework intelligence analysis (/think), stress-tests it (/red-team), researches the red team findings, re-thinks with adversarial data, re-red-teams, and iterates until divergence between think and red-team is low (conviction stabilizes). Then generates a comprehensive single-file HTML report with all findings: market landscape, competitive analysis, intelligence briefs, red team results, how to win, and how you could lose. Use when the user says "/deepthink", "deep think", "deep research", or wants a comprehensive research-to-report pipeline on any idea, market, or strategic question.
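The iterate-until-divergence-is-low control flow described above can be sketched as a simple loop. The `think` and `red_team` callables here are hypothetical stand-ins for the real research steps, and the threshold and round cap are illustrative:

```python
# Sketch of the /deepthink convergence loop: alternate think and red-team
# passes until the gap between their conviction scores falls below a
# threshold (conviction has "stabilized"), or a round cap is reached.

def deepthink_loop(think, red_team, threshold=0.1, max_rounds=5):
    """think(findings) -> conviction score in [0, 1].
    red_team(conviction) -> (adversarial score in [0, 1], new findings).
    Returns (final conviction, rounds used)."""
    findings = None
    conviction = 0.5
    for round_no in range(1, max_rounds + 1):
        conviction = think(findings)          # re-think with adversarial data
        adversarial, findings = red_team(conviction)  # stress-test the result
        if abs(conviction - adversarial) < threshold:
            return conviction, round_no       # divergence is low: stop
    return conviction, max_rounds             # cap reached without convergence
```

The report-generation stage would then consume the accumulated findings from every round, not just the final scores.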
Senior Code Architect & Quality Assurance Engineer for 2026. Specialized in context-aware AI code reviews, automated PR auditing, and technical debt mitigation. Expert in neutralizing "AI-Smells," identifying performance bottlenecks, and enforcing architectural integrity through multi-job red-teaming and surgical remediation suggestions.
Apply structured critical thinking — identifying claims, evidence, reasoning chains, hidden assumptions, and logical fallacies — to evaluate or construct specific written arguments rigorously. Use this skill when the user presents a concrete argument, claim, op-ed, research finding, or piece of reasoning to be analyzed for logical validity or flaws, e.g. when they say 'is this argument valid', 'what logical fallacies are in this', or 'what assumptions am I making in this thesis'. Do NOT use for casual plan review, trip planning, project risk brainstorming, or pre-mortems — 'poke holes in my plan' requests are red-team / risk review, not argument analysis.
Provides calibrated decision analysis using Charlie Munger-style multiple mental models, inversion, incentive mapping, circle-of-competence checks, misjudgment audits, second-order effects, and forecast updates. Use when the user asks for an oracle take, a hard call, a decision memo, a premortem, an outside view, a red-team, or a sanity-check, asks 'what am I missing' or 'think this through', or wants a strategy, hire, investment, plan, product, partnership, or major life choice analysed. Avoid for simple factual lookups or time-sensitive legal, medical, or market questions without fresh evidence.
Answers AI agent evaluation methodology questions with practical, opinionated guidance grounded primarily in Microsoft's agent evaluation ecosystem (MS Learn, Eval Scenario Library, Triage & Improvement Playbook, Eval Guidance Kit) supplemented by select industry sources.