Dstl8 — AI-Native Observability Skill
Dstl8 distills logs across dev, staging, and production into root cause
analysis, impact assessment, and fix recommendations. All environments
are queryable via the Dstl8 MCP server using the same tools.
Setup gate
Before running any workflow below, verify Dstl8 is set up:
- CLI installed and authenticated (an active profile is shown)
- At least one source connected and ingesting (the source list shows it)
- MCP server installed and the AI client restarted
If any of these are missing, read the setup instructions from this skill
directory and complete setup first. Do not attempt setup from memory.
If Dstl8 tools aren't visible even after setup is reportedly complete, tell
the user: "I don't see a Dstl8 MCP server connected. Check your MCP
configuration, restart your AI client, or re-run setup. See the setup
instructions in this skill directory."
Tool surface preference
This skill exposes Dstl8 functionality through two surfaces. Choose
correctly between them; the wrong choice wastes turns and produces
worse answers.
MCP tools are the right surface for investigation, queries, incident triage,
and any run-time use of the data. These are the high-leverage tools
the user installed Dstl8 to get. Default here for any question shaped
like "show me X", "what happened with Y", "why is Z broken",
"investigate W", "did my deploy fix it", or "what's going on in prod".
The CLI via bash is the right surface for setup, configuration, source
management, and installation: rare, admin-flavored actions.
If a user explicitly asks for the CLI ("run dstl8 sources" / "use the
CLI to..."), use bash. Otherwise, when both surfaces could serve the
question, prefer MCP.
The CLI via bash is a fallback for when MCP is unavailable, not a default.
When MCP isn't loaded, prefer asking the user to restart over substituting
the CLI. If the user asks an investigation question and MCP tools aren't
available in the session (e.g., they just signed up and Claude Code hasn't
been restarted yet), tell them directly: "MCP tools aren't loaded in this
session — restart Claude Code and ask again." Don't paper over it with
parallel CLI calls; that produces a degraded answer and burns turns. CLI
fallback is fine for setup verification (e.g., confirming ingestion), but
not for investigation flows.
Starting moves
Most workflows start with one of these:
| Start with | When |
|---|---|
| | You need to discover available environments, services, or time ranges. Good default first call. |
| | "What's going on?" — get active incidents |
| | Quick health pulse across services |
| + severity filter | "Why is X broken?" — find specific errors |
Entry patterns
"What's going on?" — Situational awareness
Get active incidents (filtered by environment if specified), then health
by service. Present active incidents + health across environments.
When the user names a specific environment, pass it as a filter — don't return all incidents
and let them sort through it.
"Why is X broken?" — Targeted investigation
Search errors (service + keyword + error) → patterns (recurring?) →
incidents (already tracked?). Then cross-environment: does the same pattern
appear in other environments? Same error in local + staging + prod = systematic.
Only in prod = environment-specific. Present: root cause → impact → fix.
"Check staging" / "Check production" / "Check <env>" — Environment-specific
Run the standard triage queries for that environment, then compare against
the production baseline.
New pattern in staging not in prod = flag before promoting. Pattern in a dev
environment matching a known prod incident = good signal, developer is
reproducing it. Present with "safe to promote" or "flag before promoting" verdict.
"Did my deploy fix it?" — Verification
Use the deploy event to anchor windows → compare errors before vs after →
compare health over the same windows. If deployed to staging, compare
staging post-deploy vs production — are they converging? Give a clear verdict.
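The before/after comparison can be sketched in code. This is a minimal illustration of the windowing logic only, not a Dstl8 API: the `LogEvent` shape, field names, and the `verify_deploy` helper are assumptions for the sketch.

```python
from dataclasses import dataclass

@dataclass
class LogEvent:
    timestamp: float  # Unix seconds
    level: str        # e.g. "error", "warn", "info"

def verify_deploy(events, deploy_ts, window_s=3600):
    """Count errors in equal-length windows anchored on the deploy time.

    Hypothetical helper: the real comparison runs through Dstl8's tools;
    this only shows the before/after windowing idea."""
    before = sum(1 for e in events
                 if deploy_ts - window_s <= e.timestamp < deploy_ts
                 and e.level == "error")
    after = sum(1 for e in events
                if deploy_ts <= e.timestamp < deploy_ts + window_s
                and e.level == "error")
    if after < before:
        verdict = "improved"
    elif after == before:
        verdict = "unchanged"
    else:
        verdict = "regressed"
    return {"errors_before": before, "errors_after": after, "verdict": verdict}
```

The key design point is that both windows are the same length, so the error counts are directly comparable; a shrinking count after the deploy is the signal behind a "fixed" verdict.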
"I'm about to make changes" — Pre-coding context (Loop 1)
Query the knowledge graph for the service → check across all environments →
look for recent issues. Surface what the developer should know
before writing code.
Defensive patterns
- Several tools have required parameters and will fail without them; check each tool's schema rather than guessing.
- CRITICAL: incident queries MUST include a state or time-range filter. Unfiltered calls return 10-15k tokens and blow up context. Never call them unfiltered; pass a state filter or start/end timestamps. If the filtered response is still large (>5k tokens), use a narrower time window or pipe the response through a local script to extract what you need rather than re-fetching.
- Discover, don't guess. Run a discovery call when unsure about environment or service names.
- Use only the CLI's documented time flag and time formats (relative durations and absolute timestamps). Plausible-looking variants from other CLIs don't exist here and will error.
- Respect environment scope. When the user specifies an environment ("in brewhaus", "check staging"), filter queries to that environment. Cross-environment data is supplementary context, not the main answer. When no environment is specified, infer from git branch, repo name, or conversation context. Only ask if you can't determine it.
- Always think cross-environment. When investigating one environment, check if the same pattern exists in others. But respect the user's scope — if they ask about a specific environment, lead with that environment's data and present cross-environment findings as secondary context, not the primary answer.
- Persist findings. After triage reaches root cause, write the findings to the knowledge graph. This feeds future sessions.
- Verify after fixing. Proactively offer before/after comparison post-deploy.
- Check before creating. Search for existing incidents/entities before creating — Möbius may have already created them. Ask the user before creating incidents.
- Convert timestamps. Always present human-readable times, not raw Unix.
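The "pipe the response through a local script" tactic can look like the sketch below. The response shape (an "incidents" array with "id", "service", and "state" fields) is a hypothetical example, not Dstl8's documented schema:

```python
import json

def summarize_incidents(raw_json, max_items=10):
    """Keep only the fields needed from a large tool response, locally,
    instead of re-fetching with tighter filters.
    Field names here are assumptions about the response shape."""
    data = json.loads(raw_json)
    rows = []
    for inc in data.get("incidents", [])[:max_items]:
        rows.append({
            "id": inc.get("id"),
            "service": inc.get("service"),
            "state": inc.get("state"),
        })
    return rows
```

Extracting locally costs nothing in context budget, whereas re-fetching with different filters repeats the full token spend.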
Incident status mapping
| Code | Label |
|---|---|
| 0 | Open |
| 1 | Investigating |
| 2 | Active |
| 3 | Resolved |
| 4 | Closed |
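A sketch of applying this mapping together with the timestamp rule from the defensive patterns (present human-readable times, never raw Unix). The helper name is illustrative, not part of Dstl8:

```python
from datetime import datetime, timezone

# Labels taken from the incident status mapping table above.
STATUS_LABELS = {0: "Open", 1: "Investigating", 2: "Active", 3: "Resolved", 4: "Closed"}

def describe_incident(status_code, opened_unix):
    """Render a raw status code and Unix timestamp as human-readable text."""
    label = STATUS_LABELS.get(status_code, f"Unknown ({status_code})")
    opened = datetime.fromtimestamp(opened_unix, tz=timezone.utc)
    return f"{label} since {opened:%Y-%m-%d %H:%M} UTC"
```

Falling back to `Unknown (<code>)` for unmapped codes keeps the output honest if the API ever returns a status outside the table.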
Output conventions
Present investigation results as: Summary (one sentence) → Root cause
→ Impact (quantified) → Recommended fix (concrete) → Confidence level.
Default to roughly 250 words. Expand to a longer post-mortem format only
when the user explicitly asks for one ("write up a full post-mortem," "give
me the long version"). For routine investigation queries, brevity beats
thoroughness — the user is iterating, not archiving.
For post-mortems, add a timeline table and action items with owner/priority.
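The ordering convention can be pinned down as a small formatter. This is a sketch of the prescribed structure only; the helper itself is hypothetical:

```python
def format_findings(summary, root_cause, impact, fix, confidence):
    """Assemble an investigation answer in the prescribed order:
    summary -> root cause -> quantified impact -> concrete fix -> confidence."""
    return "\n".join([
        f"Summary: {summary}",
        f"Root cause: {root_cause}",
        f"Impact: {impact}",
        f"Recommended fix: {fix}",
        f"Confidence: {confidence}",
    ])
```

Keeping the five sections fixed makes routine answers scannable; the post-mortem format extends this, it doesn't replace it.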
Feedback loops
Three loops drive compounding value:
- Loop 1 (Intent): Before coding — surface past incidents and patterns via knowledge graph. Only works if Loop 3 persisted findings.
- Loop 2 (Iteration): During dev — validate in dev environments and staging before promoting to production. Cross-environment comparison catches regressions.
- Loop 3 (Production intelligence): After deploy — triage → fix → verify → persist to graph for Loop 1.