ad-conf-check
AutoDeploy Config Checker
Verify that AutoDeploy YAML configs were applied at runtime by cross-referencing with server logs and optionally graph dumps.
Input
- TensorRT-LLM source directory (required): path to the TensorRT-LLM repo root. Used to read the latest `default.yaml` and source code for up-to-date log patterns (the bundled reference doc may be stale).
- YAML config file path(s) (required): one or more AutoDeploy YAML configs. When multiple files are provided, they are deep-merged left-to-right (later files override earlier ones for overlapping keys).
- Server log file path (required): log output from the AutoDeploy server run.
- Graph dump directory (optional): the `AD_DUMP_GRAPHS_DIR` output directory containing per-transform graph snapshots (`NNN_stage_transform.txt`). Provides additional evidence for resolving UNKNOWN results.
- Nsys trace file (optional): Nsight Systems profile (`.nsys-rep` or `.sqlite`) from the server run. Useful for verifying executor-level configs that produce no log output (e.g., `enable_chunked_prefill`, multi-stream concurrency, CUDA graph capture/replay).
- Table output file path (optional): path to write human-friendly table results.
- JSON output file path (optional): path to write machine-friendly JSON results.
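The left-to-right deep merge described for multiple YAML files can be sketched as follows. This is a minimal illustration and not the bundled parser script: the function names are hypothetical, the values are placeholders, and treating non-dict values (including lists) as whole-value replacements is an assumption.

```python
from functools import reduce


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict; `override` wins on overlapping keys, nested dicts merge recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: scalars and lists are replaced wholesale, not combined.
            merged[key] = value
    return merged


def merge_all(configs: list[dict]) -> dict:
    """Merge configs in priority order: lowest priority first, highest last."""
    return reduce(deep_merge, configs, {})


# Placeholder values for illustration only.
defaults = {
    "compile_backend": "placeholder-default",
    "kv_cache_config": {"enable_block_reuse": True, "free_gpu_memory_fraction": 0.5},
}
user = {
    "compile_backend": "torch-cudagraph",
    "kv_cache_config": {"free_gpu_memory_fraction": 0.85},
}
merged = merge_all([defaults, user])
```

Keys that only the earlier file sets (here `enable_block_reuse`) survive the merge, which is why listing the defaults first yields a complete key set.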
Output
Human-friendly table (always presented to user)
- Verification table: one row per config key, with columns Config (key=value), Result (APPLIED / FAILED / SKIPPED / DISABLED / UNKNOWN), and Evidence (log line or graph analysis proving the result).
- Summary line: total counts per status (e.g., `Total configs checked: 29 | APPLIED: 23 | UNKNOWN: 4 | ...`).
- FAILED/WARNING details: expanded information for any configs that failed or had warnings.
Machine-friendly JSON (when JSON output path is given)
JSON file with two top-level keys:
- `results`: array of objects, each with `config`, `value`, `status`, `evidence`.
- `summary`: object with `total` (int) and `counts` (object mapping status to count, only non-zero statuses included).
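One possible way to assemble the JSON output with the two top-level keys described above is sketched here. The helper name `build_report` and the evidence strings are hypothetical placeholders; only the key names (`results`, `summary`, `total`, `counts`) come from the description.

```python
import json
from collections import Counter


def build_report(results: list[dict]) -> dict:
    """Wrap per-config results with a summary that lists only statuses that occurred."""
    # Counter only holds statuses that appear at least once, so zero counts are omitted.
    counts = Counter(entry["status"] for entry in results)
    return {
        "results": results,
        "summary": {"total": len(results), "counts": dict(counts)},
    }


# Placeholder entries; evidence text is illustrative, not real log output.
results = [
    {"config": "compile_backend", "value": "torch-cudagraph",
     "status": "APPLIED", "evidence": "<matching log line>"},
    {"config": "transforms.compile_model.piecewise_enabled", "value": "True",
     "status": "APPLIED", "evidence": "<matching log line>"},
    {"config": "enable_chunked_prefill", "value": "True",
     "status": "UNKNOWN", "evidence": "no log evidence found"},
]
report = build_report(results)
report_text = json.dumps(report, indent=2)
```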
Workflow
1. [Collect Inputs] Ask the user for the following inputs:
   - TensorRT-LLM source directory (required): path to the TensorRT-LLM repo root. Used to cross-check `default.yaml` and source code for the latest log patterns.
   - YAML config file path(s) (required): one or more AutoDeploy configs used for the run. When multiple YAMLs are provided, they are deep-merged left-to-right: later files override earlier ones for overlapping keys. Tell the user: "If you have multiple configs (e.g., a default config and a user override), list them in priority order: lowest priority first, highest priority last."
   - Server log file path (required): the log output from the server.
   - Graph dump directory (optional but recommended): the `AD_DUMP_GRAPHS_DIR` output directory containing per-transform graph snapshots. Files are named `NNN_stage_transform.txt` and show the graph AFTER each transform. When provided, graph analysis provides additional evidence (e.g., verifying sharded weights, collective ops, fused ops). This is especially useful for resolving UNKNOWN results.
   - Nsys trace file (optional): Nsight Systems profile (`.nsys-rep` or `.sqlite`) from the server run. Useful for verifying executor-level configs that produce no log output (e.g., `enable_chunked_prefill`, multi-stream concurrency, CUDA graph capture/replay).
   - TensorRT-LLM source reference paths:
     - Example configs: `<trtllm_src>/examples/auto_deploy/model_registry/configs/*.yaml`
     - Default transform config (all available transforms and their defaults): `<trtllm_src>/tensorrt_llm/_torch/auto_deploy/config/default.yaml`
2. [Update Reference Doc] Before checking configs, ensure the bundled reference doc is up-to-date with the TensorRT-LLM source.

   Launch the `ad-conf-check-update` agent with:
   - `<trtllm_src>`: the TensorRT-LLM source directory from step 1
   - `<skill_dir>`: the directory containing this SKILL.md file

   The agent compares `<trtllm_src>/tensorrt_llm/_torch/auto_deploy/config/default.yaml` and the AutoDeploy source code against `<skill_dir>/references/config_log_patterns.md`. If any configs were added, removed, or renamed, or if log patterns have changed, the agent updates the reference doc in place and reports what changed.

   After the agent completes:
   - If the reference doc was updated, inform the user: "Updated references/config_log_patterns.md to match the latest TensorRT-LLM source; see the agent's change summary below." Then show the agent's summary.
   - If no changes were needed, briefly note: "Reference doc is up-to-date with the TensorRT-LLM source."
3. [Parse Configs] Run the parser script to flatten the YAML configs (`<skill_dir>` is the directory containing this SKILL.md file):

   Input: The TensorRT-LLM `default.yaml` as the base, followed by the user's YAML config path(s) from step 1. Always include `default.yaml` first so that user configs override the defaults.

   ```bash
   python3 <skill_dir>/scripts/parse_config.py <trtllm_src>/tensorrt_llm/_torch/auto_deploy/config/default.yaml <yaml_path1> [<yaml_path2> ...]
   ```

   This deep-merges the YAML files left-to-right (later files override earlier ones) and flattens nested keys into dotted notation (e.g., `kv_cache_config.enable_block_reuse`). By including `default.yaml` first, every known config key appears in the output even if the user only overrode a subset.

   Output: Flat JSON with all config `{key, value}` pairs. Example:

   ```json
   {
     "yaml_files": ["default.yaml", "user_override.yaml"],
     "total_configs": 15,
     "configs": [
       {"key": "compile_backend", "value": "torch-cudagraph"},
       {"key": "kv_cache_config.free_gpu_memory_fraction", "value": "0.85"},
       {"key": "transforms.compile_model.piecewise_enabled", "value": "True"}
     ]
   }
   ```
4. [Quick Scan] Check each config against the server log using parallel agents.

   Input: Config list from step 3, server log path from step 1, and references/config_log_patterns.md.

   Split the configs from step 3 into 3 groups by section and launch 3 agents in parallel, each checking its group:

   | Agent | Config group | Keys starting with | Reference section |
   | --- | --- | --- | --- |
   | Agent 1 | Top-level configs | `runtime`, `compile_backend`, `attn_backend`, `max_seq_len`, `max_num_tokens`, `max_batch_size`, `cuda_graph_batch_sizes`, `enable_chunked_prefill`, `model_factory`, `dtype`, etc. | "Top-Level Config Parameters" |
   | Agent 2 | KV cache configs | `kv_cache_config.*` | "kv_cache_config Parameters" |
   | Agent 3 | Transform configs | `transforms.*` (or any key matching a transform name like `compile_model`, `detect_sharding`, `multi_stream_*`, `fuse_*`, `gather_logits_*`, etc.) | "Transform Parameters" |

   Each agent receives:
   - Its subset of `{key, value}` pairs
   - The server log file path
   - The reference doc references/config_log_patterns.md (including verification source tags: `[log]`, `[graph]`, `[nsys]`)
   - The nsys trace file path (if provided)

   Each agent, for every config in its group:
   - Reads the reference doc to find the relevant keywords and patterns for this config key.
   - Greps the server log for those patterns. Key search strategies:
     - For transform configs: grep for `[stage=..., transform=<name>]` and check the `[SUMMARY]` line (`matches=N` → APPLIED if N>0, SKIPPED if N=0).
     - For configs with success/failure indicators: grep for those specific strings.
     - For configs with no known log pattern: grep for `key=value` or the key name near the value.
     - For configs with `enabled: false`: mark as DISABLED without log search.
   - Assigns a status based on what was found:
     - APPLIED: log confirms the config took effect
     - FAILED: log shows the config was attempted but fell back or errored
     - SKIPPED: transform ran but found nothing to do (0 matches)
     - DISABLED: config explicitly set `enabled: false`
     - UNKNOWN: no log evidence found (config may still be active but unlogged)
   - Records the evidence (the matching log line or lack thereof).

   Output: Each agent returns a list of `{config, value, status, evidence}` entries for its group. Merge all 3 lists into the combined result.
5. [Double Check] For any UNKNOWN entries from step 4, investigate further before presenting results to the user (FAILED entries already have concrete log evidence and do not need double-checking):

   Input: List of UNKNOWN config entries from step 4 output, the server log file, and references/config_log_patterns.md.
   - Re-read references/config_log_patterns.md for alternative patterns
   - Grep the log more broadly for the transform name: `[stage=..., transform=<name>]`
   - Look for `[APPLY]` prefixed lines and `[SUMMARY]` lines for that transform
   - Check for `"Falling back"`, `"Skipping"`, or `"failed"` near the transform logs
   - If graph dump directory was provided:
     - Graph files are named `NNN_stage_transform.txt`; each contains the FX graph AFTER that transform. Compare before/after by reading consecutive files.
     - Graph evidence can upgrade UNKNOWN to APPLIED (e.g., collective ops after lm_head confirm sharding, fused custom ops confirm fusion transforms).
     - Graph analysis verifies: sharding (collective ops, weight shape changes), attention backend (op types), MoE fusion (fused op presence), GEMM fusion (linear op count changes), RMSNorm/SwiGLU/RoPE pattern matching (custom op presence).
     - See references/graph_verification_patterns.md for the full list of graph-based checks.
   - If nsys trace was provided, check for executor-level configs tagged `[nsys]` in the reference doc (e.g., `enable_chunked_prefill`, `enable_block_reuse`, multi-stream concurrency, CUDA graph capture/replay)

   Output: For each investigated UNKNOWN entry, either additional evidence found (with status upgrade) or confirmation that the config is genuinely unlogged.
6. [Report] Present the final results to the user.

   ALWAYS show the full detailed table. Do NOT summarize or condense. Present one row per config with columns:
   - Config: the config key and its value (e.g., `compile_backend = torch-cudagraph`)
   - Result: one of APPLIED, FAILED, SKIPPED, DISABLED, UNKNOWN
   - Evidence: the log line or pattern that proves the result

   After the table, show the summary line (e.g., `Total configs checked: 29 | APPLIED: 23 | ...`) and any FAILED/WARNING details. Include any additional findings from the Double Check step (step 5).

   If the user requested output files, write:
   - Table output: the human-friendly table as plain text
   - JSON output: machine-friendly JSON with `results` array and `summary` object
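The flattening that the Parse Configs step describes (nested keys into dotted notation) might look roughly like the sketch below. It is not the actual `parse_config.py`; the function name `flatten` is hypothetical, and stringifying values is an assumption inferred from the example output, where values appear as `"0.85"` and `"True"`.

```python
def flatten(config: dict, prefix: str = "") -> list[dict]:
    """Flatten a nested config dict into {key, value} entries with dotted keys."""
    flat = []
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections, carrying the dotted prefix along.
            flat.extend(flatten(value, dotted))
        else:
            # Assumption: values are stored as strings, as in the example output.
            flat.append({"key": dotted, "value": str(value)})
    return flat


merged = {
    "compile_backend": "torch-cudagraph",
    "kv_cache_config": {"free_gpu_memory_fraction": 0.85},
    "transforms": {"compile_model": {"piecewise_enabled": True}},
}
configs = flatten(merged)
```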
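The status rule for transform configs in the Quick Scan step (DISABLED when `enabled: false`, otherwise APPLIED or SKIPPED from `matches=N`) can be sketched as below. The regex is an assumption built from the documented `[stage=..., transform=<name>] [SUMMARY] matches=N` shape, the stage labels in the sample lines are made up, and FAILED detection is deliberately left out because it depends on per-config failure strings.

```python
import re

# Assumed shape, based on the documented summary-line pattern.
SUMMARY_RE = re.compile(
    r"\[stage=[^,\]]+, transform=(?P<name>[^\]]+)\]\s*\[SUMMARY\]\s*matches=(?P<n>\d+)"
)


def transform_status(name: str, enabled: bool, log_lines: list[str]) -> tuple[str, str]:
    """Return (status, evidence) for one transform config."""
    if not enabled:
        return "DISABLED", "enabled: false in config"
    for line in log_lines:
        match = SUMMARY_RE.search(line)
        if match and match.group("name") == name:
            status = "APPLIED" if int(match.group("n")) > 0 else "SKIPPED"
            return status, line.strip()
    return "UNKNOWN", "no [SUMMARY] line found"


# Synthetic lines for illustration; stage names are hypothetical.
log_lines = [
    "[stage=stage_a, transform=detect_sharding] [SUMMARY] matches=4 | time: ...",
    "[stage=stage_b, transform=compile_model] [SUMMARY] matches=0 | time: ...",
]
applied = transform_status("detect_sharding", True, log_lines)
skipped = transform_status("compile_model", True, log_lines)
unknown = transform_status("some_unlogged_transform", True, log_lines)
disabled = transform_status("detect_sharding", False, log_lines)
```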
Key Patterns to Know
- Every transform logs: `[stage=<stage>, transform=<name>] [SUMMARY] matches=N | time: ...`
- Piecewise success chain: `dual-mode enabled` -> `prepared with N submodules` -> `captured graphs`
- Piecewise failure: `"model is not a GraphModule...Falling back to eager execution"`
- Sharding: `"Using allreduce strategy: SYMM_MEM"`, `"Applied N TP shards from config"`
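As a rough illustration of checking the piecewise success chain, the sketch below looks for the three markers in order and treats the documented failure string as FAILED. The function name and the exact regexes are assumptions; in particular, matching `N` as digits is a guess, and the sample log text is synthetic.

```python
import re

# Markers taken from the patterns above; the \d+ for N is an assumption.
CHAIN = [
    re.compile(r"dual-mode enabled"),
    re.compile(r"prepared with \d+ submodules"),
    re.compile(r"captured graphs"),
]
FAILURE = "Falling back to eager execution"


def piecewise_status(log_text: str) -> str:
    """APPLIED if all chain markers appear in order, FAILED on fallback, else UNKNOWN."""
    if FAILURE in log_text:
        return "FAILED"
    pos = 0
    for pattern in CHAIN:
        match = pattern.search(log_text, pos)
        if match is None:
            return "UNKNOWN"
        pos = match.end()  # the next marker must appear after this one
    return "APPLIED"


ok_log = "dual-mode enabled\nprepared with 3 submodules\ncaptured graphs\n"
bad_log = "model is not a GraphModule...Falling back to eager execution\n"
partial_log = "dual-mode enabled\n"
```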
Gotchas
- Every YAML key must appear in the output. Check all configs from the YAML, not just ones with known patterns. If a config key has no entry in the reference doc, grep the log for the key name and value. New/unknown configs should still be reported; never silently skip them.
- UNKNOWN does not mean the config was ignored. Some configs (e.g., `enable_chunked_prefill`, `enable_block_reuse`) are consumed at executor/runtime level and produce no log output. UNKNOWN means "no log evidence found", not "config was not applied".
- Deprecated config names may cause FAILED. For example, `torch_dtype` is deprecated in favor of `dtype`, and `cuda_graph_batch_sizes` (top-level) is replaced by `cuda_graph_config.batch_sizes`. Look for deprecation warning messages in the log. Old keys may be silently ignored.
- Runtime may adjust configured values. For example, `max_seq_len` may be configured as 262144 but adjusted down to 16384 at runtime due to memory constraints. Report this as APPLIED with a WARNING annotation.
- ANSI color codes in logs. AutoDeploy uses colored log output. Strip or ignore ANSI escape sequences when matching patterns.
- Reference doc is auto-updated. Step 2 runs the `ad-conf-check-update` agent to sync references/config_log_patterns.md with the latest TensorRT-LLM source before any config checking begins. If the agent reports changes, review its summary to understand what shifted.
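For the ANSI color code gotcha, one common approach is to strip CSI escape sequences before pattern matching. The sketch below is a generic helper, not part of the skill's scripts, and it covers only `ESC [ ... <letter>` sequences, which include the usual color codes.

```python
import re

# Matches CSI sequences such as "\x1b[32m" (color) and "\x1b[0m" (reset).
ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def strip_ansi(line: str) -> str:
    """Remove ANSI CSI escape sequences so plain-text patterns can match."""
    return ANSI_RE.sub("", line)


colored = "\x1b[32m[SUMMARY] matches=2\x1b[0m"
plain = strip_ansi(colored)
```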
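The deprecated-name gotcha could be surfaced early by flagging old keys in the flattened config list before scanning the log. This is a hypothetical helper: the mapping holds only the two examples named above and is not an exhaustive list from the TensorRT-LLM source.

```python
# Only the two examples mentioned in the gotchas; not an exhaustive mapping.
DEPRECATED_KEYS = {
    "torch_dtype": "dtype",
    "cuda_graph_batch_sizes": "cuda_graph_config.batch_sizes",
}


def find_deprecated(flat_configs: list[dict]) -> list[tuple[str, str]]:
    """Return (old_key, replacement) pairs for deprecated keys present in the config."""
    return [
        (entry["key"], DEPRECATED_KEYS[entry["key"]])
        for entry in flat_configs
        if entry["key"] in DEPRECATED_KEYS
    ]


# Placeholder values for illustration only.
flat_configs = [
    {"key": "torch_dtype", "value": "<value>"},
    {"key": "max_seq_len", "value": "262144"},
]
deprecated = find_deprecated(flat_configs)
```

A hit here is a hint to look for a deprecation warning in the log, since the old key may have been silently ignored.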