Found 1,573 Skills
Calibrate an LLM judge against human labels using data splits, TPR/TNR, and bias correction. Use after writing a judge prompt (write-judge-prompt) when you need to verify alignment before trusting its outputs. Do NOT use for code-based evaluators (those are deterministic; test with standard unit tests).
Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs.
INVOKE THIS SKILL when building evaluation pipelines for LangSmith. Covers three core components: (1) Creating Evaluators - LLM-as-Judge, custom code; (2) Defining Run Functions - how to capture outputs and trajectories from your agent; (3) Running Evaluations - locally with evaluate() or auto-run via LangSmith. Uses the langsmith CLI tool.
💰 Save Token (Token Saver). TRIGGERS: Use when token cost is high, the conversation is long, files have been read multiple times, or before complex tasks. A guiding skill that helps agents identify and avoid sending duplicate context to LLM APIs. Teaches agents to recognize repeated content and summarize it instead of re-sending it, thereby saving tokens.
Use this skill when optimizing for AI-powered search engines and generative search results - Google AI Overviews, ChatGPT Search (SearchGPT), Perplexity, Microsoft Copilot Search, and other LLM-powered answer engines. Covers Generative Engine Optimization (GEO), citation signals for AI search, entity authority, the llms.txt specification, and LLM-friendliness patterns based on Princeton GEO research. Triggers on visibility in AI search, getting cited by LLMs, or adapting SEO for the AI search era.
Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust boundary violations, conditional side effects, and other structural issues.
E-commerce warehouse and inventory optimization advisor. Analyzes inventory health, calculates safety stock and reorder points, performs ABC analysis, evaluates fulfillment costs, and provides actionable recommendations for improving efficiency. Supports all major fulfillment models: Self-fulfillment, Amazon FBA/FBM, Walmart WFS, 3PL, Shopify Fulfillment, TikTok Shop, Dropshipping, and Hybrid setups. No API key required. Use when: (1) reducing stockouts or overstock, (2) calculating safety stock levels, (3) optimizing warehouse costs, (4) improving Amazon IPI score, (5) analyzing inventory KPIs.
Add Olakai monitoring to existing AI code: wrap your LLM client, configure custom KPIs, and validate the integration end-to-end.
Run a free 35B AI coding agent on Apple Silicon Macs using local LLMs via llama.cpp or MLX with web search, shell, and file tools.
Auto-Claude Graphiti memory system configuration and usage. Use when setting up memory persistence, configuring LLM/embedding providers, querying knowledge graph, or optimizing memory performance.
Fine-tune LLMs using the Tinker API. Covers supervised fine-tuning, reinforcement learning, LoRA training, vision-language models, and both high-level Cookbook patterns and low-level API usage.
This skill should be used when the user asks to "fix the issues", "optimize existing content", "create new content for AI visibility", "run Morphiq Build", "generate schema markup", "create an llms.txt file", "run the content lab", or mentions building content fixes, generating schema, rewriting content for AI citations, or creating policy files. Consumes a Prioritized Roadmap (or user prompt, or existing content) and produces build artifacts through a 6-step content lab pipeline.