Preserve critical session state when compacting context. Use when the context window is filling up and you need to summarize or reduce it while keeping essential debugging information.
Read production traces, identify what's failing, and build failure taxonomies using open coding and axial coding methodology. Use when debugging agent or pipeline quality, investigating "why are my outputs bad?", or before building any evaluator — error analysis must come first. Do NOT use when you have already identified failure modes and need evaluators (use build-evaluator) or datasets (use generate-synthetic-dataset).
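For readers unfamiliar with the terminology, here is a minimal Python sketch of the open-coding/axial-coding workflow this skill applies; the trace IDs, notes, and category names are invented for illustration.

```python
from collections import defaultdict

# Open coding: read each trace and attach a free-form failure note.
# (Trace IDs and notes below are invented examples.)
open_codes = {
    "trace_001": "hallucinated a file path that does not exist",
    "trace_002": "retried the same failing tool call five times",
    "trace_003": "cited a file it never actually read",
}

# Axial coding: cluster the free-form notes into named failure modes,
# producing the taxonomy that later drives evaluator design.
taxonomy = defaultdict(list)
taxonomy["fabricated context"] += ["trace_001", "trace_003"]
taxonomy["tool-call loop"] += ["trace_002"]

for failure_mode, trace_ids in taxonomy.items():
    print(f"{failure_mode}: {len(trace_ids)} traces")
```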
Multi-agent investigation for stubborn bugs. Use when you are going in circles while debugging, need to investigate browser/API interactions, face complex bugs that resist normal debugging, or when symptoms don't match expectations. Launches parallel agents with different perspectives and uses Chrome tools for evidence gathering.
Systematically debug issues, investigate bugs, troubleshoot problems, or track down errors with persistent state across context resets. Triggers include "debug", "investigate bug", "troubleshoot", "find the problem", "why isn't this working", and "debug session".
Use when diagnosing agent failures, debugging lost-in-middle issues, understanding context poisoning, or when the user asks about "context degradation", "lost in middle", "context poisoning", "attention patterns", "context clash", or "agent performance drops".
Debugs errors and traces failures in AI agents and their tools. Use this skill when the user says: "the agent is failing", "tool call not working", "error in the pipeline", "debug this", "why is the agent doing X instead of Y", "trace the execution", "agent is stuck", "infinite loop", "model response won't parse", "context overflow". Identifies context errors, infinite loops, malformed tool calls, response parsing issues, and subagent conflicts.
This skill should be used when the user asks to "diagnose context problems", "fix lost-in-middle issues", "debug agent failures", "understand context poisoning", or mentions context degradation, attention patterns, context clash, context confusion, or agent performance degradation. A core context engineering skill — also activates when the user mentions "context engineering" or "context-engineering" in the context of diagnosing and mitigating context failures.
ABSOLUTE MUST for debugging and inspecting LLM/AI agent traces using PostHog's MCP tools. Use when the user pastes a trace URL (e.g. /llm-observability/traces/<id>), asks to debug a trace, figure out what went wrong, check whether an agent used a tool correctly, verify that context/files were surfaced, inspect subagent behavior, investigate LLM decisions, or analyze token usage and costs.
Agentica server + Claude proxy setup: architecture, startup sequence, and debugging.
Instruments Python and TypeScript code with MLflow Tracing for observability. Triggers on questions about adding tracing, instrumenting agents/LLM apps, getting started with MLflow tracing, or tracing specific frameworks (LangGraph, LangChain, OpenAI, DSPy, CrewAI, AutoGen). Examples: "How do I add tracing?", "How to instrument my agent?", "How to trace my LangChain app?", "Getting started with MLflow tracing", "Trace my TypeScript app".
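As a rough sketch of what this instrumentation typically looks like in Python (assuming MLflow 2.14+ where the tracing API is available; summarize is a made-up placeholder for the user's own code):

```python
import mlflow

# Automatic tracing for a supported framework (OpenAI shown here);
# similar autolog hooks exist for other integrated libraries.
mlflow.openai.autolog()

# Manual tracing: the @mlflow.trace decorator records the function's
# inputs, outputs, and latency as a span in the active trace.
@mlflow.trace
def summarize(document: str) -> str:
    # Placeholder body; a real implementation would call a model here.
    return document[:100]

summarize("A long document to trace...")
```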
This skill should be used when inspecting, analyzing, or querying Claude Code session logs. Use when users ask about session history, want to find sessions, analyze context usage, extract tool call patterns, debug agent execution, or understand what happened in previous sessions. Essential for understanding Claude Code's ~/.claude/projects/ structure, JSONL session format, and the erk extraction pipeline.
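A minimal sketch of what querying those logs can look like in Python; the ~/.claude/projects/ layout comes from the skill description, while the specific JSONL field names below are assumptions for illustration.

```python
import json
from pathlib import Path

# One subdirectory per project, one JSONL file per session;
# each line is a standalone JSON event.
sessions_dir = Path.home() / ".claude" / "projects"

for session_file in sessions_dir.glob("*/*.jsonl"):
    with session_file.open() as f:
        for line in f:
            if not line.strip():
                continue
            event = json.loads(line)
            # "type" and "message" are assumed field names here.
            if event.get("type") == "assistant":
                print(session_file.name, event.get("message", {}).get("model"))
```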