dynamo-router-starter

Original：🇺🇸 English

Translated

1 scriptsChecked / no sensitive code detected

Start or patch Dynamo router modes and run router endpoint smoke checks. Use for round-robin, KV-aware, least-loaded, or device-aware routing setup; use recipe-runner for recipe deployment and troubleshoot for failure diagnosis.

4installs

Sourcenvidia/skills

Added on2026-05-30

NPX Install

npx skill4agent add nvidia/skills dynamo-router-starter

SKILL.md Content

View Translation Comparison →

Dynamo Router Starter

Purpose

Make Dynamo routing feel easy by getting a baseline router mode running, enabling KV-aware routing when appropriate, and proving the endpoint works. Keep the user focused on exact commands and success signals, not router internals.

Prerequisites

Python 3.10+ with the
```
dynamo
```
package importable (
```
python3 -m dynamo.frontend --help
```
works).
For Kubernetes runs:
```
kubectl
```
configured with access to the target namespace and a deployed Dynamo recipe.
Network reachability to the frontend service (port-forward or direct).
A model already loaded into at least one worker (
```
/v1/models
```
returns at least one entry).

Required Inputs

Collect or infer:

local Python/CLI or Kubernetes recipe path

desired mode:

round-robin

,

kv

,

least-loaded

,

device-aware-weighted

,

direct

, or

random

frontend port or Kubernetes frontend service
whether workers publish KV events; if not, use approximate KV mode
model name for smoke requests, if
```
/v1/models
```
cannot discover it

Instructions

1. Establish A Baseline

For local bring-up with already registered workers:

bash

python3 -m dynamo.frontend --router-mode round-robin --http-port 8000

For Kubernetes, inspect the selected recipe

deploy.yaml

and locate the frontend service. If the recipe is not already deployed, use

dynamo-recipe-runner

first.

2. Enable KV Routing

For local frontend:

bash

python3 -m dynamo.frontend --router-mode kv --http-port 8000

For Kubernetes, patch only the frontend service env:

yaml

envs:
  - name: DYN_ROUTER_MODE
    value: kv

If backend workers are not publishing KV cache events, set approximate mode instead of leaving the router waiting for events:

yaml

envs:
  - name: DYN_ROUTER_USE_KV_EVENTS
    value: "false"

3. Smoke Test

After port-forwarding the frontend service or starting local frontend, run:

bash

python3 scripts/check_router_health.py \
  --base-url http://127.0.0.1:8000

This must verify

/v1/models

and, when a model is discoverable, one

/v1/chat/completions

request.

4. Compare Modes Carefully

When comparing round-robin vs KV routing:

use the same model, workers, prompt set, concurrency, and sampling settings
send repeated-prefix prompts if demonstrating KV reuse
label the result as a smoke comparison unless enough benchmark samples were collected
do not claim throughput improvement from a single chat request

If the endpoint is unhealthy or workers are missing, switch to

dynamo-troubleshoot

.

Available Scripts

Script	Purpose	Arguments
`scripts/check_router_health.py`	Smoke-test `/v1/models` and one chat completion against a Dynamo frontend	`--base-url` , `--retries` , `--timeout`

Invoke via the agentskills.io

run_script()

protocol:

python

run_script("scripts/check_router_health.py", args=["--base-url", "http://127.0.0.1:8000"])

Examples

Local KV-routed frontend on port 8000, then smoke-test it:

bash

python3 -m dynamo.frontend --router-mode kv --http-port 8000 &
python3 scripts/check_router_health.py --base-url http://127.0.0.1:8000

Kubernetes-deployed frontend reachable via port-forward:

bash

kubectl port-forward svc/qwen-vllm-disagg-frontend 8000:8000 -n dynamo-demo &
python3 scripts/check_router_health.py --base-url http://127.0.0.1:8000 --retries 3

Equivalent through the agent protocol:

python

run_script("scripts/check_router_health.py", args=["--base-url", "http://127.0.0.1:8000", "--retries", "3"])

Output Contract

Return:

mode selected and why
local command or Kubernetes env patch
frontend service or URL
smoke-test result
any limitation, such as approximate KV mode or missing worker KV events
next command to run for a fuller comparison

Limitations

Smoke test is one chat completion; it is not a benchmark. Use
```
dynamo-benchmark
```
for throughput/latency numbers.
KV-aware mode without worker KV-event publication degrades to approximate mode; this skill flags but does not fix the underlying worker config.
Mode comparisons require matched workloads; cross-mode latency claims need separate benchmark runs.

Troubleshooting

Symptom	Likely cause	Next step
`/v1/models` returns empty list	No worker registered with the frontend	Verify worker pods are Ready; confirm they connect to the same etcd/NATS
Smoke chat request times out	Frontend up, workers not serving	Switch to `dynamo-troubleshoot` ; inspect worker logs
KV mode hangs	Workers do not publish KV cache events	Set `DYN_ROUTER_USE_KV_EVENTS=false` (approximate mode)
Connection refused on port-forward	Port-forward dropped or wrong service name	Re-run port-forward; verify the frontend service name matches the recipe

Benchmark

See

BENCHMARK.md

for the NVCARPS-EVAL performance report (auto-generated by the NVSkills CI pipeline). To refresh, re-run

/nvskills-ci

on an upstream PR touching this skill.

References

Read
```
references/router-modes.md
```
for the compact mode/env map.
Use
```
scripts/check_router_health.py
```
for endpoint smoke tests.

dynamo-router-starter

NPX Install

Tags

SKILL.md Content

Dynamo Router Starter

Purpose

Prerequisites

Required Inputs

Instructions

1. Establish A Baseline

2. Enable KV Routing

3. Smoke Test

4. Compare Modes Carefully

Available Scripts

Examples

Output Contract

Limitations

Troubleshooting

Benchmark

References