Search Results: observability

Found 456 Skills

DevOps & Cloud Services | ancoleman/ai-design-compo...

implementing-observability

Monitoring, logging, and tracing implementation using OpenTelemetry as the unified standard. Use when building production systems requiring visibility into performance, errors, and behavior. Covers OpenTelemetry (metrics, logs, traces), Prometheus, Grafana, Loki, Jaeger, Tempo, structured logging (structlog, tracing, slog, pino), and alerting.

4 scripts/Checked

DevOps & Cloud Services | atalovesyou/claude-skills...

logging-observability

Guidelines for structured logging, distributed tracing, and debugging patterns across languages. Covers logging best practices, observability, security considerations, and performance analysis.
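
As a rough illustration of the structured logging this entry refers to, here is a minimal sketch using only the Python standard library. It is not taken from the skill itself: the `JsonFormatter` class, the `checkout` logger name, and the `context` key passed through `extra=` are all hypothetical names chosen for the example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        # "context" is a hypothetical key used only in this sketch.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info(
    "payment accepted",
    extra={"context": {"order_id": "A-17", "trace_id": "abc123"}},
)
print(buffer.getvalue().strip())
```

Carrying an identifier such as `trace_id` on every line is one common way to correlate log entries with distributed traces; the field names here are assumptions, not a convention the skill prescribes.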

DevOps & Cloud Services | theneoai/awesome-skills

datadog-observability--security-platform

Expert skill for Datadog Observability & Security Platform

Backend Development | jabrena/cursor-rules-java

182-java-observability-metrics-micrometer

Use when you need to implement or improve Java metrics observability with Micrometer, including meter design, naming/tag conventions, cardinality control, timers/counters/gauges/distribution summaries, percentiles/histograms, Actuator/Prometheus integration, and metrics validation through tests. This should trigger for requests such as "Improve metrics", "Apply Micrometer", "Add metrics observability", or "Refactor Micrometer instrumentation". Part of the cursor-rules-java project.

AI & Machine Learning | assistant-ui/skills

observability

Adds tracing, telemetry, and observability to an assistant-ui backend. Use when wiring an AI SDK route handler (streamText/generateText, toUIMessageStreamResponse) to a tracing backend: Langfuse via OpenTelemetry (LangfuseSpanProcessor and NodeSDK in instrumentation.ts, experimental_telemetry isEnabled, propagateAttributes with traceName/userId/sessionId, langfuseSpanProcessor.forceFlush on serverless), LangSmith via wrapAISDK(ai) from langsmith/experimental/vercel (createLangSmithProviderOptions, awaitPendingTraceBatches), or Helicone via createOpenAI baseURL https://oai.helicone.ai/v1 with the Helicone-Auth header. Also covers rendering collected spans with @assistant-ui/react-o11y headless primitives (SpanResource, SpanPrimitive Root/Indent/CollapseToggle/StatusIndicator/TypeBadge/Name/Children, SpanByIndexProvider, SpanData/SpanState) mounted via useAui/AuiProvider from @assistant-ui/store. Use for missing or empty traces, edge vs nodejs runtime telemetry, serverless flush issues, or trace waterfalls.

AI & Machine Learning | orchestra-research/ai-res...

phoenix-observability

Open-source AI observability platform for LLM tracing, evaluation, and monitoring. Use when debugging LLM applications with detailed traces, running evaluations on datasets, or monitoring production AI systems with real-time insights.

DevOps & Cloud Services | google/skills

google-cloud-recipe-networking-observability

Investigates Google Cloud networking issues by analyzing logs, metrics, and diagnostics. Use when investigating VPC Flow Logs, NAT, firewall, or threat logs, querying latency and throughput metrics, or running Connectivity Tests for path diagnostics.

Testing & QA | sickn33/antigravity-aweso...

api-testing-observability-api-mock

You are an API mocking expert specializing in realistic mock services for development, testing, and demos. Design mocks that simulate real API behavior and enable parallel development.

DevOps & Cloud Services | sickn33/antigravity-aweso...

observability-monitoring-slo-implement

You are an SLO (Service Level Objective) expert specializing in implementing reliability standards and error budget-based practices. Design SLO frameworks, define SLIs, and build monitoring that balances reliability with delivery velocity.
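
To make the error-budget idea in this entry concrete, here is a minimal sketch of the arithmetic, assuming a request-based SLI (successful requests divided by total requests). The `SloWindow` class, its field names, and the sample numbers are hypothetical and do not come from the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloWindow:
    """Request counts for one SLO evaluation window."""

    target: float  # e.g. 0.999 means 99.9% of requests should succeed
    total: int     # requests observed in the window
    failed: int    # requests that violated the SLI

    @property
    def sli(self) -> float:
        """Observed success ratio; an empty window counts as fully good."""
        if self.total == 0:
            return 1.0
        return (self.total - self.failed) / self.total

    @property
    def allowed_failures(self) -> float:
        """How many failures the target tolerates at this traffic level."""
        return (1.0 - self.target) * self.total

    @property
    def budget_remaining(self) -> float:
        """Fraction of the error budget left; negative once overspent."""
        allowed = self.allowed_failures
        if allowed == 0:
            return 1.0 if self.failed == 0 else float("-inf")
        return 1.0 - self.failed / allowed


# Hypothetical window: 1,000,000 requests, 400 failures, 99.9% target.
window = SloWindow(target=0.999, total=1_000_000, failed=400)
print(f"SLI: {window.sli:.4%}")
print(f"Error budget remaining: {window.budget_remaining:.1%}")
```

A team might gate risky releases on `budget_remaining` staying above some threshold, which is one way of trading reliability against delivery velocity; the threshold itself is a policy choice rather than something this sketch decides.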

DevOps & Cloud Services | elastic/agent-skills

observability-k8s-investigation

Investigate Kubernetes workload, node, and control-plane issues using OTel telemetry (EDOT). Use when diagnosing pod failures (CrashLoopBackOff, OOMKilled, Error), node pressure, resource exhaustion, image pull failures, admission rejections, autoscaling anomalies, or correlating K8s state with application signals. Covers the OTel ingest path only; the legacy ECS Kubernetes integration shape is out of scope.

Backend Development | elastic/agent-skills

observability-edot-dotnet-migrate

Migrate a .NET application from the classic Elastic APM .NET agent to the EDOT .NET SDK. Use when switching from Elastic.Apm.* packages to Elastic.OpenTelemetry.

DevOps & Cloud Services | elastic/agent-skills

observability-edot-python-migrate

Migrate a Python application from the classic Elastic APM Python agent to the EDOT Python agent. Use when switching from elastic-apm to elastic-opentelemetry.
