Search Results: llm-observability

Found 32 Skills

exploring-llm-traces

Required for debugging and inspecting LLM/AI agent traces with PostHog's MCP tools. Use when the user pastes a trace URL (e.g. /llm-observability/traces/<id>), asks to debug a trace, figure out what went wrong, check if an agent used a tool correctly, verify context/files were surfaced, inspect subagent behavior, investigate LLM decisions, or analyze token usage and costs.

6 scripts/Checked

AI & Machine Learning | arize-ai/arize-skills

arize-link

Generate deep links to traces, spans, and sessions in the Arize UI. Use when the user wants a clickable URL to open a specific trace, span, or session.

AI & Machine Learning | langfuse/skills

langfuse

Interact with Langfuse and access its documentation. Use when needing to (1) query or modify Langfuse data programmatically via the CLI — traces, prompts, datasets, scores, sessions, and any other API resource, (2) look up Langfuse documentation, concepts, integration guides, or SDK usage, or (3) understand how any Langfuse feature works. This skill covers CLI-based API access (via npx) and multiple documentation retrieval methods.

AI & Machine Learning | getsentry/sentry-agent-sk...

sentry-setup-ai-monitoring

Set up Sentry AI Agent Monitoring in any project. Use when asked to monitor LLM calls, track AI agents, or instrument OpenAI/Anthropic/Vercel AI/LangChain/Google GenAI. Detects installed AI SDKs and configures appropriate integrations.

AI & Machine Learning | orq-ai/assistant-plugins

analyze-trace-failures

Read production traces, identify what's failing, and build failure taxonomies using open coding and axial coding methodology. Use when debugging agent or pipeline quality, investigating "why are my outputs bad?", or before building any evaluator — error analysis must come first. Do NOT use when you already have identified failure modes and need evaluators (use build-evaluator) or datasets (use generate-synthetic-dataset).

AI & Machine Learning | akillness/oh-my-skills

langsmith

Instrument, trace, evaluate, and monitor LLM applications and AI agents with LangSmith. Use when setting up observability for LLM pipelines, running offline or online evaluations, managing prompts in the Prompt Hub, creating datasets for regression testing, or deploying agent servers. Triggers on: langsmith, langchain tracing, llm tracing, llm observability, llm evaluation, trace llm calls, @traceable, wrap_openai, langsmith evaluate, langsmith dataset, langsmith feedback, langsmith prompt hub, langsmith project, llm monitoring, llm debugging, llm quality, openevals, langsmith cli, langsmith experiment, annotate llm, llm judge.

2 scripts/Attention

AI & Machine Learning | orq-ai/assistant-plugins

setup-observability

Set up orq.ai observability for LLM applications. Use when setting up tracing, adding the AI Router proxy, integrating OpenTelemetry, auditing existing instrumentation, or enriching traces with metadata.

AI & Machine Learning | portkey-ai/skills

portkey-typescript-sdk

Integrate Portkey AI Gateway into TypeScript/JavaScript applications. Use when building LLM apps with observability, caching, fallbacks, load balancing, or routing across 200+ LLM providers.

AI & Machine Learning | comet-ml/opik-skills

opik

Opik observability for LLM agents — Agent Configuration, Local Runner (opik connect), Evaluation Suites, threads, integrations. Use for "configure my agent", "connect my agent", "evaluate my agent" or "integrate with Opik".

AI & Machine Learning | langwatch/skills

tracing

Add LangWatch tracing and observability to your code. Use for both onboarding (instrument an entire codebase) and targeted operations (add tracing to a specific function or module). Supports Python and TypeScript with all major frameworks.

AI & Machine Learning | aradotso/trending-skills

future-agi-platform

Expert skill for using Future AGI — the open-source end-to-end platform for evaluating, observing, and improving LLM and AI agent applications with tracing, evals, simulations, datasets, gateway, and guardrails.

AI & Machine Learning | datadog-labs/agent-skills

llm-obs-trace-rca

Root cause analysis on production LLM traces. Diagnoses why an LLM application is failing — works from eval judge verdicts, runtime errors, or structural anomalies depending on what signals are present. Walks the span tree from symptom to root cause. Use when user says "what's wrong with my app", "why is my eval failing", "analyze errors", "root cause analysis", "diagnose failures", or wants to understand production failure patterns.

3 scripts/Checked