nemotron-customize
IMPORTANT: Read this file before answering any
,
Nemotron customization, Curator curation, translation, SFT, PEFT, RL,
conversion, optimization, checkpoint or existing/hosted-endpoint evaluation, or
multi-step pipeline request. This applies whether the user names one step or
asks you to compose several steps into a pipeline.
Evaluation requests count even when no training is involved: "evaluate",
"benchmark", "smoke test", or "score" an existing/hosted endpoint, an API/model
ID, or a deployed model all route to
. Read this skill for
those too.
Purpose
Turn a model-customization request into a repo-native Nemotron step pipeline.
Plan the DAG, validate artifact wiring, and create only the YAML/config files
needed to run existing steps.
Use this skill only for inspecting, configuring, validating, running, or
submitting existing Nemotron steps or multi-step training/customization
pipelines. For frontend, dashboard, visualization, generic ML advice,
billing/access, or unrelated coding tasks, stop with a short scope note and do
not inspect the step catalog or edit files in that turn.
Prerequisites
- A checkout of the Nemotron repo with present; run from
the repo root.
- available to invoke
uv run nemotron steps ...
.
- For remote execution: an env profile TOML ( or
) with a section matching the selected step.
- For hosted services (translation, hosted eval): the auth environment variable
expected by the step (for example ), exported in the
environment — never inlined or committed.
- User-provided concrete values (model/checkpoint, data paths, output dir,
hardware/GPU count) before any command is presented as runnable.
Limitations
- Does not invent new catalog steps. When no existing step, runner, recipe, CLI,
or config can satisfy the request, it names the gap (Explorer mode) instead of
fabricating a step.
- Produces YAML/config for existing steps; new Python/shell is out of scope
except in Explorer mode after the gap is approved.
- Not for deployment-only/serving, frontend, dashboards, generic ML advice, or
non-Nemotron tasks.
- Does not guess concrete values (paths, model IDs, GPU counts, profiles); it
asks or returns when they are missing.
Core Rule
Use bundled references first. The
folder is the first decision
surface for routing, artifacts, patterns, hardware heuristics, and command
shape. Use
only as a live verification/fallback source
when you need exact current config fields, manifests, runner imports, or details
missing from bundled references.
If sources disagree:
- Checked live repo files win for exact execution.
- Bundled references win for initial routing and planning.
- Upstream docs/context packs are used only for exceptional code generation
or library API details.
Before You Begin
- Read this workflow and the relevant bundled reference before
opening repo source files.
- Route from and before any
broad repo exploration. Once a route is determined, verify only the selected
live step/config/env files needed for the answer.
- Do not emit commands with fake paths, placeholder model IDs, guessed task IDs,
guessed batch profiles, or default auth variable names presented as facts.
Ask for missing concrete values or return a handoff.
- Use as the authoritative checklist before
finalizing configs or execution commands.
- For pipeline requests, plan before editing. Do not create or modify files
until the DAG, artifact edges, required inputs, and validation checks are
stated and approved.
- For one-shot command requests, prefer a complete parameterized command in one
response over exploratory prose, but only after required inputs are known.
If the user already provides the needed values and asks for only a command,
answer with the command first and keep explanation minimal.
- Output discipline (keeps responses tight): emit one command block per step,
include only flags the step actually defines, and add no speculative or
invented flags. Keep narrative to a few lines — the command plus the required
safety/profile callouts, not a tutorial. Do not restate reference content the
user did not ask for.
- Do not spawn subagents for one-shot command lookup. Use the bundled command
reference directly; verify only the selected step if needed.
Safety
Keep Bash scoped to repo-safe commands such as
uv run nemotron steps ...
,
targeted tests,
, and config validation. Never run environment
dumps (
,
, broad
) or commands that expose secret values.
For remote submissions, destructive changes, or expensive launches, confirm
before execution.
When inspecting env/config files, avoid printing whole files that may contain
secrets. Use targeted reads, report only section names and env-var names, and
redact values for fields containing
,
,
,
,
, or
.
Reference Map
| Question | Read first | Live fallback / verification |
|---|
| Which step or category fits? | | uv run nemotron steps list/show
, then selected |
| Do artifacts chain? | | src/nemotron/steps/types.toml
|
| What run shape should I emit? | | checked-in config YAML plus active profile TOML |
| Remote profile generation or selection | | active , , or |
| What hardware/backend should I recommend? | | selected step and |
| Which cross-step guardrails apply? | | src/nemotron/steps/patterns/<id>.md
|
| How do I run the full workflow? | | selected step configs, , and runners |
| Which upstream library API should generated code use? | references/context/index.toml
-> matching pack | selected , , upstream docs |
| New project scaffold, only when existing repo code cannot support the request | references/act/PROJECT.md
| existing repo project/recipe shape |
| Per-stage code rules, only when existing repo code cannot support the request | | selected and shared runner |
Do not start by reading category READMEs or
for ordinary decisions.
Select candidates from bundled references, then verify exact live details before
writing configs or final commands.
Routing
Use
as the authoritative home for step selection and
route-specific fast paths. Use
,
, and
only to resolve artifact, cross-step, or hardware constraints after the catalog
narrows the route.
Each step is independent and stitching steps together is your job. Compose any
pipeline by artifact matching from the user's end goal: chain a step only when
the next step consumes an artifact type nothing upstream already produces. Do
not rely on fixed, named step combinations.
Instructions
Follow the flow that matches the request: a recommendation/plan, a single-step
command, or a multi-step pipeline. In all cases, route from the bundled
references first, gather required inputs, and verify the selected live step
before presenting anything as runnable.
Recommendation Response
Use this shape for planning answers:
,
,
,
,
, and
.
Call out the stack to avoid when the user's constraints make it a poor fit.
Whenever the answer includes a command that touches a hosted service or remote
execution, also state, in the answer:
- The auth env-var name and that its value must be exported in the environment,
never inlined or committed (never print the value).
- For /, the env TOML profile prerequisite; if no profile
exists, mark the command or give the local shape.
Single-Step Command Flow
- Confirm repo root has and .
- Read and the selected section of
.
- Verify the selected live step with
uv run nemotron steps show <step_id>
when available, or the selected when the CLI is unavailable.
- Read the requested checked-in config or user overlay before emitting the
command.
- For remote execution, read or repo-root and
pick an actual section whose profile matches the step.
- Emit the full command in one reply with the source tier:
, , , or .
Canonical command shapes live in
.
Pipeline Workflow
For pipelines with two or more stages, use
Orient -> Plan -> Act -> Verify.
Read
for the phase checklist.
- Orient from bundled references and user constraints.
- Plan a DAG with artifact types, configs, patterns, and validation checks.
- Wait for approval before writing configs or code.
- Act with YAML/config-only changes whenever an existing step can satisfy the
request.
- Verify every generated YAML, artifact edge, command, and README command
before reporting completion.
Catalog Mode
Use when the request maps to existing steps. Fast path:
->
->
-> verify selected live manifest/config/profile ->
add a new named config under the selected step's
directory.
Customization Surface
- Always customize through the step catalog under . Never
divert to alternate recipe CLIs such as
src/nemotron/cli/commands/super3/
or
, even for Super3/Nano3 work. If a request seems to need those,
map it back to the equivalent catalog step (e.g. ).
- Make customizations as NEW config files inside the selected step's
src/nemotron/steps/<cat>/<step>/config/
directory, for example
src/nemotron/steps/sft/megatron_bridge/config/my_super3.yaml
.
- Never edit the checked-in , , other shipped configs,
, , or shared runners. Adding a new config file beside
them is the expected and only customization write.
- Base new configs on the checked-in schema (read it, copy the
needed fields), then override only what the request requires.
Explorer Mode
Use only after confirming no existing step, runner, recipe, CLI, or YAML config
surface can satisfy the request. Full procedure lives in
.
Configuration Alignment
Surface these constraints before commands or config writes:
- SFT packing , Megatron-Bridge , packed sequence size,
tokenizer, and chat template must match.
- Prepared and are tokenizer-locked; rebuild after
tokenizer, chat-template, sequence-length, split, or blend changes.
- Megatron-Bridge global batch size must be divisible by data-parallel size;
start distributed validation with micro batch size 1.
- TP/PP/CP/EP choices must fit GPU count, memory, topology, and model divisibility.
- LoRA merge requires the exact base checkpoint/model and tokenizer used during
adapter training.
- Conversion/eval of Megatron checkpoints should point at a concrete
checkpoint, not a parent run directory.
- Hosted eval and translation configs store auth env-var names only, not values.
Operational Nuances
- Smoke configs (, ) are wiring tests, not quality
evidence.
- references belong in recipe-backed configs; standalone YAML uses
plain paths.
- Keep pretraining data and from the same run/release.
- Write customized configs as new files in the step's
src/nemotron/steps/<cat>/<step>/config/
directory; never modify the
checked-in or other shipped configs.
- For LoRA, preserve the exact base checkpoint and tokenizer/template metadata
needed by later merge/eval.
- For translation and hosted eval, mention auth environment variable names only,
never values.
Boundaries
Do:
- Always route through the step catalog under ; never use
alternate recipe CLIs (
src/nemotron/cli/commands/super3|nano3/...
).
- Reuse repo CLIs, runners, recipes, steps, and checked-in configs first.
- Customize by adding a new config under the step's directory; base it
on rather than copying it blindly.
- Validate artifact edges and cite patterns that changed the plan.
- Ask about hardware/data/backend/output path when missing.
- Surface tradeoffs such as AutoModel vs Megatron-Bridge and full SFT vs LoRA.
Do not:
- Invent steps when a catalog step fits.
- Skip Plan for pipelines with two or more stages.
- Generate Python or shell when YAML is enough.
- Add monitoring/W&B unless asked.
- Assume GPU count, env profile, endpoint type, task ID, or auth value.
- Generate Slurm/Airflow/Kubeflow wrappers unless the request explicitly needs
deployment scaffolding.
- Edit checked-in step files (/, other shipped configs,
, , runners); only add a new config beside them.
- Restate all per-step rules in ; use bundled references and source
fallback.
Examples
Single-step routing (LoRA on a small box). User: "LoRA fine-tune a HF model
on 2 GPUs." Route per
->
(HF base + small GPU
count); do not offer Megatron-Bridge. Collect base model, JSONL data path,
output dir, LoRA rank/alpha, then emit one
uv run nemotron steps run peft/automodel -c <config> --dry-run ...
command.
Multi-step pipeline (Super3 SFT). User: "data prep + SFT for Super3." This is
two stages, so plan first: SFT on Super3 -> Megatron-Bridge, which consumes
, so
is required upstream. Present the
DAG (
sft_packing -> sft/megatron_bridge
), align
/
/
tokenizer, wait for approval, then add new configs under
src/nemotron/steps/<step>/config/<name>.yaml
. Super3 needs a remote profile;
state the env TOML prerequisite or mark
.
Hosted-endpoint evaluation (no training). User: "benchmark my hosted model
endpoint." Route to
with
. Collect endpoint URL,
model id, task IDs, and the auth env-var name (value exported, never inlined).
See
Evaluation Examples.
Troubleshooting
| Situation | Action |
|---|
| Artifact types do not chain | Recheck ; insert a converter or change the DAG before writing configs. |
| Remote profile or is unclear | Read active env TOML; do not guess profile names. |
| Config key is unclear | Verify selected checked-in config, , and shared runner before editing. |
| Strategy points to a missing context pack | Skip the pack, use catalog/pattern text, and flag the plan with WARNING: <topic> docs unavailable
. |
| Hardware looks too small | Use ; suggest smaller model, AutoModel, then LoRA before full Megatron-Bridge. |
| Two Act attempts fail | Stop, explain what was tried and failed, and ask how to proceed. |
| No existing repo path matches | Check references/context/index.toml
and selected source fallback; use Explorer mode only after naming the gap. |