Systematic debugging for ADK agents — trace reading, log analysis, common failure diagnosis, and the debug loop.
ALWAYS use this skill to debug and inspect LLM/AI agent traces using PostHog's MCP tools. Use when the user pastes a trace URL (e.g. /llm-observability/traces/<id>), asks to debug a trace, figure out what went wrong, check if an agent used a tool correctly, verify context/files were surfaced, inspect subagent behavior, investigate LLM decisions, or analyze token usage and costs.
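A tiny sketch of the first step most of these requests share: pulling the trace ID out of a pasted URL. The /llm-observability/traces/<id> path shape comes from the description above; the helper name and sample URL are illustrative, not part of PostHog's MCP tooling.

```python
from urllib.parse import urlparse

def extract_trace_id(url: str) -> str | None:
    """Pull the trace ID out of a pasted PostHog LLM-observability URL.

    Assumes the /llm-observability/traces/<id> path shape from the
    skill description; real URLs may carry query params or fragments.
    """
    parts = urlparse(url).path.rstrip("/").split("/")
    if "traces" in parts:
        idx = parts.index("traces")
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None

# Hypothetical URL for illustration only.
print(extract_trace_id("https://us.posthog.com/llm-observability/traces/0195-abc"))
```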
Implement distributed tracing with Jaeger and Zipkin for tracking requests across microservices. Use when debugging distributed systems, tracking request flows, or analyzing service performance.
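A minimal sketch of the instrumentation side in Python, assuming the opentelemetry-sdk with the OTLP gRPC exporter (Jaeger ingests OTLP natively; Zipkin would need the dedicated Zipkin exporter or a collector in between). The service name and endpoint are placeholders.

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Name the service so the tracing backend can group its spans.
provider = TracerProvider(resource=Resource.create({"service.name": "checkout"}))
# Jaeger accepts OTLP over gRPC on port 4317 by default.
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("handle-order") as span:
    span.set_attribute("order.id", "o-123")
    with tracer.start_as_current_span("charge-card"):
        pass  # downstream call; trace context propagates automatically
```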
Guidance for interpreting SPAA (Stack Profile for Agentic Analysis) files. Covers the file format and how to use it to identify performance bottlenecks, memory leaks, and optimization opportunities. Use when the user is trying to read a .spaa file to understand an application's performance.
Fetch, organize, and analyze LangSmith traces for debugging and evaluation. Use when you need to: query traces/runs by project, metadata, status, or time window; download traces to JSON; organize outcomes into passed/failed/error buckets; analyze token/message/tool-call patterns; compare passed vs failed behavior; or investigate benchmark and production failures.
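A minimal sketch of the query-and-bucket step using the langsmith Python client, assuming LANGSMITH_API_KEY is set in the environment. The project name and the "correctness" feedback key are placeholders for whatever signal the project actually records.

```python
from datetime import datetime, timedelta, timezone
from langsmith import Client

client = Client()  # reads LANGSMITH_API_KEY from the environment

# Pull the last 24h of root runs for one project.
runs = client.list_runs(
    project_name="my-project",  # placeholder project name
    start_time=datetime.now(timezone.utc) - timedelta(days=1),
    is_root=True,
)

# Organize outcomes into passed/failed/error buckets.
buckets = {"passed": [], "failed": [], "error": []}
for run in runs:
    if run.error:
        buckets["error"].append(run)
    elif (run.feedback_stats or {}).get("correctness", {}).get("avg", 0) >= 0.5:
        buckets["passed"].append(run)
    else:
        buckets["failed"].append(run)

print({k: len(v) for k, v in buckets.items()})
```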
Help the user systematically identify and categorize failure modes in an LLM pipeline by reading traces. Use when starting a new eval project, after significant pipeline changes (new features, model switches, prompt rewrites), when production metrics drop, or after incidents.
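As a concrete endpoint for that process, a hypothetical sketch of the tallying step, assuming traces have already been hand-labeled during open coding and exported one per JSON file with a failure_mode field (both the layout and the field name are assumptions).

```python
import json
from collections import Counter
from pathlib import Path

# Hypothetical layout: one trace per JSON file, hand-labeled with a
# "failure_mode" field while reading traces.
labels = Counter()
for path in Path("traces/").glob("*.json"):
    trace = json.loads(path.read_text())
    labels[trace.get("failure_mode", "unlabeled")] += 1

# Surface the dominant failure modes first to decide what to fix or eval.
for mode, count in labels.most_common():
    print(f"{count:4d}  {mode}")
```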
Analyze VictoriaMetrics query trace JSON to diagnose slow queries and produce a structured performance report with time breakdown, bottleneck analysis, and optimization recommendations. ALWAYS use this skill when: (1) the user mentions a VictoriaMetrics or VM trace, query trace, or trace JSON, (2) the user provides or references a JSON file containing duration_msec/message/children fields, (3) the user asks why a VictoriaMetrics/VM query is slow and has trace output, (4) the user asks about vmstorage node distribution, cache misses, or rollup performance in the context of a trace, (5) the user mentions vmselect trace, trace=1, or query performance debugging with VictoriaMetrics. This skill provides a structured report template that ensures consistent, thorough analysis — do not attempt to analyze VM traces without it.
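The duration_msec/message/children fields named above make the trace a simple tree, so the time breakdown is a recursive walk. A minimal sketch (the 50% bottleneck threshold and the file name are arbitrary choices, not part of the skill):

```python
import json

def walk(node, depth=0, total=None):
    """Print the duration tree of a VictoriaMetrics query trace.

    Assumes the duration_msec/message/children fields named in the
    description; flags any node consuming over half of total query time.
    """
    total = total or node["duration_msec"] or 1
    share = 100 * node["duration_msec"] / total
    flag = "  <-- bottleneck" if share > 50 and depth > 0 else ""
    print(f"{'  ' * depth}{node['duration_msec']:10.1f} ms ({share:5.1f}%) {node['message'][:80]}{flag}")
    for child in node.get("children", []):
        walk(child, depth + 1, total)

with open("trace.json") as f:  # output of a query run with trace=1
    walk(json.load(f))
```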
Use when your agent or environment is broken — wrong answers, errors, timeouts, tool failures, or CLI issues. Reads traces and logs to diagnose root causes. Also checks prerequisites when the CLI itself isn't working. Triggers on: "agent not working", "wrong answer", "agent error", "tool call failing", "debug agent", "check logs", "read traces", "broken", "500 error", "424 error", "model access denied", "command not found", "stuck in DELETING", "maxVms exceeded", "cold start diagnosis", "cold start slow", "agentcore create error", "create failed", "exit code 7", "connection refused local dev". Not for deploy failures — use agents-deploy. Not for performance tuning without errors — use agents-optimize. Not for VPC configuration — use agents-build. Not for observability setup or missing logs — use agents-optimize.
Debug errors, test failures, and unexpected behavior with log analysis and correlation. Use when encountering issues, error messages, analyzing logs, or investigating production errors.
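A minimal sketch of the correlation step, assuming a hypothetical log format with a per-request ID; real formats differ, so the regex is illustrative only.

```python
import re
from collections import defaultdict

# Hypothetical log shape: "<timestamp> <level> [req-<id>] <message>".
# Grouping lines by request ID lets one failing request read as a story.
LINE = re.compile(r"^(?P<ts>\S+) (?P<level>\w+) \[(?P<req>req-\w+)\] (?P<msg>.*)$")

by_request = defaultdict(list)
with open("app.log") as f:
    for line in f:
        if m := LINE.match(line):
            by_request[m["req"]].append((m["ts"], m["level"], m["msg"]))

# Show only requests that hit an ERROR, with their full preceding context.
for req, events in by_request.items():
    if any(level == "ERROR" for _, level, _ in events):
        print(f"--- {req} ---")
        for ts, level, msg in events:
            print(f"{ts} {level:5s} {msg}")
```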
Analyzes a single MLflow trace to answer a user query about it. Use when the user provides a trace ID and asks to debug, investigate, find issues, root-cause errors, understand behavior, or analyze quality. Triggers on "analyze this trace", "what went wrong with this trace", "debug trace", "investigate trace", "why did this trace fail", "root cause this trace".
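A minimal sketch of fetching and scanning one trace, assuming the MLflow Tracing Python API (mlflow.get_trace, available in recent MLflow releases). The trace ID is a placeholder; the printed span fields are the ones most useful for latency and error triage.

```python
import mlflow

# Fetch one trace by ID; the ID here is a placeholder.
trace = mlflow.get_trace("tr-1234567890abcdef")

# Walk the spans looking for errors and where the latency went.
for span in trace.data.spans:
    duration_ms = (span.end_time_ns - span.start_time_ns) / 1e6
    print(f"{span.name:30s} {span.span_type:12s} {duration_ms:9.1f} ms {span.status.status_code}")
```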
Query Scout APM performance data via REST API. Use when investigating app performance, slow endpoints, error groups, traces, or insights like N+1 queries and memory bloat.
INVOKE THIS SKILL when optimizing, improving, or debugging LLM prompts using production trace data, evaluations, and annotations. Covers extracting prompts from spans, gathering performance signal, and running a data-driven optimization loop using the ax CLI.