tao-run-deft-aoi

Skill: tao-run-deft-aoi

When to Use This Skill

Use this skill when the user wants an agent to run the full DEFT AOI improvement loop for an NVIDIA TAO VisualChangeNet / ChangeNet PCB inspection model: baseline evaluation, RCA, ingestion of pre-generated synthetic defects, data mining, retraining, and deployment gating until a KPI target is met. AnomalyGen is not run inline in this EA variant; the customer pre-generates NG/OK pairs out-of-band and places them under <workspace>/augmentation/anomalygen/.
  • "Run the DEFT loop"
  • "Fine-tune until FAR < 0.1% at recall=100%"
  • "Improve my AOI ChangeNet model using RCA and synthetic defects"
  • "Iterate training until false accept rate meets the target"
Do not use this skill for a single standalone TAO training run, one-off inference, generic anomaly generation, or RCA-only analysis. Use the relevant agent directly when the user asks for only that step.

Base Model

The loop operates on NVIDIA TAO Visual ChangeNet classify with the NVIDIA C-RADIOv2-B backbone, fine-tuned end-to-end. The architecture is defined in specs/baseline_spec.yaml; that file is the source of truth. All pretrained weights come from HuggingFace (HF_TOKEN required); NGC_API_KEY_* only gate container pulls. ChangeNet backbone resolution and the staged-file/HF-URL fallback for model.backbone.pretrained_backbone_path are owned by references/visual-changenet.md. SigLIP for k-NN mining is owned by references/tao-mine-aoi-images.md. No AnomalyGen-side checkpoints are required in this EA variant: pre-generated synthetic pairs are ingested directly from <workspace>/augmentation/anomalygen/{reconstructed_image,original_image}/; see Pipeline step 3 in references/pipeline.md.

Train AutoML Policy

DEFT AOI owns the iterative data-improvement loop, retraining cadence, and KPI checkpoint selection. For this workflow only, bypass model-level AutoML even when the underlying Visual ChangeNet model metadata has automl_enabled: true. Invoke every Visual ChangeNet train stage, including baseline and iteration retrain, with the run override automl_policy: off (plain training). This is a workflow-level override only; do not change model metadata, and do not apply this policy to other workflows.
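A minimal sketch of keeping the override scoped to the run rather than the model. The helper name build_train_overrides and the dictionary shapes are assumptions for illustration; only automl_enabled and automl_policy: off come from the policy above.

```python
import copy


def build_train_overrides(model_metadata: dict) -> dict:
    """Return per-run overrides for a DEFT AOI train stage (hypothetical helper).

    The metadata is only read, never modified, so the override stays
    workflow-level even when automl_enabled is true on the model.
    """
    _ = model_metadata.get("automl_enabled", False)  # inspected, deliberately ignored
    return {"automl_policy": "off"}  # plain training for baseline and every retrain


metadata = {"automl_enabled": True}
snapshot = copy.deepcopy(metadata)
overrides = build_train_overrides(metadata)
```

The same overrides would be passed to both the baseline and the iteration retrain invocations; the metadata object is left untouched.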

Launch Intake

After the user confirms they want to run this workflow, ask which supported platform they intend to run on. Generate the platform choices with:
bash
${TAO_SKILL_BANK_PATH:-~/tao-skills-external}/scripts/list_tao_platforms.py \
  --skill-bank ${TAO_SKILL_BANK_PATH:-~/tao-skills-external} --format text
After platform selection, run:
bash
${TAO_SKILL_BANK_PATH:-~/tao-skills-external}/scripts/list_tao_platforms.py \
  --skill-bank ${TAO_SKILL_BANK_PATH:-~/tao-skills-external} \
  --platform <platform> --format text
Ask only for credentials relevant to that platform, plus model-specific credentials required by the selected workflow.

Agent Behavior

There is exactly one user gate: pre-flight confirmation. Print the Pre-Flight Summary (see Pre-Flight Summary in references/pre-flight.md), then STOP and wait for the user to type "go", "yes", "looks good", or similar explicit approval. Do not launch any side-effecting step (docker run, training, SDG, mutations under ${RESULTS_DIR}/) before that approval; reading specs, listing files, docker image inspect, and populating the summary table are fine. "Autonomous" describes behavior after this gate, not before it. Do not skip the gate even if the user's original prompt sounded urgent ("just run it", "go ahead"): the summary itself is the artifact they need to see before approving.
After the gate, the skill is fully autonomous. Run the entire loop without asking for confirmation. Do not pause between steps. Do not ask "want me to continue?" — just continue. Only stop if a step fails with an unrecoverable error or a hard-stop gate fires. Print a one-line status update at each step milestone so the user can follow progress.
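The gate logic can be sketched roughly as below. The function name, the summary_printed flag, and the exact-match rule are assumptions; an agent would judge "similar explicit approval" more flexibly than a fixed phrase set.

```python
# Phrases quoted in the text; anything else is treated as "not yet approved" here.
APPROVALS = {"go", "yes", "looks good"}


def is_explicit_approval(reply: str, summary_printed: bool) -> bool:
    """True only for an approval typed after the Pre-Flight Summary was shown."""
    if not summary_printed:
        # Urgency in the original prompt ("just run it") never counts as approval.
        return False
    normalized = reply.strip().lower().rstrip(".!")
    return normalized in APPROVALS
```

Until this returns True, only read-only actions (reading specs, listing files, docker image inspect) would be allowed.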

Workflow

Execute the loop in this order. Full detail lives in the reference files cited per step.
  1. Pre-Flight. Run every check in references/pre-flight.md. Resolve workspace, specs, CSVs, checkpoints, and container images, stage the pre-gen pool once, and print the Pre-Flight Summary. Hard stop on any missing input.
  2. Baseline. If deft_state.json already has iterations.baseline.stage_completed == "train" and a best_ckpt_path pointing at an existing file (the upstream tao-run-automl-deft-pipeline pre-seeds these from its Phase 1 AutoML winner; see its Phase 1 → Phase 2 handoff), skip the train sub-step and resume at inference -> evaluate against the pre-seeded checkpoint. Otherwise run train -> inference -> evaluate by invoking the tao-skill-bank:tao-train-visual-changenet skill. Either way, then run rca by invoking tao-skill-bank:tao-analyze-gaps-visual-changenet. Read references/visual-changenet.md and references/tao-analyze-gaps-visual-changenet.md first for DEFT-loop-specific args (mounts, output dirs, deft_state.json updates).
  3. Iterate. For each iteration up to max_iterations, execute Pipeline steps 1-7 in references/pipeline.md. Between every step, re-read the tail of results/loop_log.jsonl and results/deft_state.json from disk; disk is canonical.
  4. Stop when the KPI target is met, max_iterations is reached, or a hard-stop gate fires (silent-drop, AMP allocation mismatch, train/val leakage). Never auto-retry hard stops.
  5. Render results/DEFT_Loop_Report.html after each completed iteration (and once more at loop end) by spawning the reporter subagent (agents/reporter.md). Per-stage renders are not done: every stage already appends one line to loop_log.jsonl, which is enough for a tail-watching user. The HTML render carries an iteration's worth of state, and one render per iteration keeps the per-loop token cost roughly linear in iteration count, not in stage count. Do not render inline.
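The baseline skip-train decision in step 2 and the stop conditions in step 4 can be sketched as two small predicates. This is an illustration under assumptions: the function names are hypothetical, and best_ckpt_path is assumed to sit next to stage_completed under iterations.baseline, which the text does not state.

```python
import os


def should_skip_baseline_train(state: dict, is_file=os.path.isfile) -> bool:
    """True when a pre-seeded baseline checkpoint can be reused (step 2)."""
    baseline = state.get("iterations", {}).get("baseline", {})
    ckpt = baseline.get("best_ckpt_path")  # assumed location of the field
    return (
        baseline.get("stage_completed") == "train"
        and bool(ckpt)
        and bool(is_file(ckpt))  # must point at an existing file
    )


def should_stop(kpi_met: bool, completed_iterations: int,
                max_iterations: int, hard_stop: bool) -> bool:
    """Stop conditions from step 4; a hard stop ends the loop and is never retried."""
    return hard_stop or kpi_met or completed_iterations >= max_iterations
```

The is_file parameter exists only so the check can be exercised without a real checkpoint on disk.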
All pipeline stages run inline in the parent context: the parent invokes the underlying tao-skill-bank:* skills directly via the Skill tool, layering DEFT-loop conventions on top via the matching references/*.md file. The only delegated work is HTML report rendering, handled by the reporter subagent in a fresh context so an end-of-loop render is never silently dropped when the parent's context is saturated.
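The "disk is canonical" rule from step 3 amounts to deriving the next sequence number from the log file itself rather than from anything held in context. A read-only sketch, assuming each loop_log.jsonl line is a JSON object with an integer seq field and that the first entry gets seq 1 (function names are hypothetical; writes would still go through the workflow's own logging script):

```python
import json
from pathlib import Path


def next_seq(lines) -> int:
    """Return last_seq + 1 from an iterable of JSONL lines."""
    last_seq = 0
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines
            last_seq = json.loads(line).get("seq", last_seq)
    return last_seq + 1


def next_seq_from_disk(loop_log: Path) -> int:
    """Re-read the log from disk every time; never trust an in-memory counter."""
    if not loop_log.exists():
        return next_seq([])
    with loop_log.open(encoding="utf-8") as fh:
        return next_seq(fh)
```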

Defaults

Set only when the user does not supply them; never ask about a parameter with a default. Full list in references/pre-flight.md.
  • max_iterations: 3
  • top_k_per_target: 5
  • min_similarity: 0.9 (cosine cutoff)
  • training_epochs: num_epochs from specs/baseline_spec.yaml, else 20
  • workspace root: user prompt, else ~/workspace
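The precedence described above (user value, then spec value, then built-in default) could be resolved as in the sketch below. The helper name and the workspace key are hypothetical, the spec is taken as an already-parsed dictionary, and the location of num_epochs under a train section is an assumption about the spec layout.

```python
DEFAULTS = {
    "max_iterations": 3,
    "top_k_per_target": 5,
    "min_similarity": 0.9,  # cosine cutoff
    "workspace": "~/workspace",
}
FALLBACK_EPOCHS = 20


def resolve_params(user_params: dict, spec: dict) -> dict:
    """Merge defaults, spec-derived values, and user-supplied values."""
    params = dict(DEFAULTS)
    # Assumption: num_epochs sits under the spec's train section.
    params["training_epochs"] = spec.get("train", {}).get("num_epochs", FALLBACK_EPOCHS)
    # User-supplied values win; None means "not supplied".
    params.update({k: v for k, v in user_params.items() if v is not None})
    return params


resolved = resolve_params({"max_iterations": 5}, {"train": {"num_epochs": 30}})
```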

Reference Map

Reference | Owns
--- | ---
references/pre-flight.md | Pre-Flight checks 1-11, full defaults list, Pre-Flight Summary template + the one user gate. Workspace/spec/CSV/checkpoint/image resolution, .env + versions.yaml credential resolution, GPU memory sanity (batch_size ≤ 16 on 48GB / ≤ 8 on 24GB), one-shot pre-gen staging, leakage check.
references/pipeline.md | Pipeline steps 1-7 + Augmentation Pool. RCA → route (pre-gen single-bucket promote-all-gaps, filter_by_label: false, no AG fanout) → read cached manifest → k-NN mine (top_k_per_target, min_similarity 0.9, no SDG bypass) → assemble CSV → validate → fine-tune (automl_policy: off). Source-pool assembly, per-iter mining bounds, 14-column / 4-mandatory-column CSV schema, baseline skip-train logic.
references/stage-execution.md | Available Scripts table, Stage Reference Modules (stage→skill map), path-rule invariant, SKILL/INLINE/AGENT stage types, post-stage check, report artifacts, agents/reporter.md spawn contract.
references/state-logging.md | deft_state.json + loop_log.jsonl contracts, one entry per stage, seq = last_seq + 1 from disk (disk canonical, never echo/inline jq), per-iteration + loop-end render cadence, loop-end sequence (log_stage align_token_usage → render → prepare_inference_spec), stop conditions.
references/prepare-for-inference.md | best_model.json + best_model_inference_spec.yaml contract and consumer workflow.
references/REPORT_RENDERING.md | Template fill rules followed by agents/reporter.md.
references/SCRIPT_USAGE.md | run_script() vs direct python, absolute-path resolution.
Read the relevant reference at the start of each stage, then act. If a reference file is missing, stop and ask the user to reinstall the plugin — do not substitute generic shell commands.
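A missing-reference check of the kind implied here might look like the following sketch. The list mirrors the Reference Map; the function name and the skill_root argument are assumptions about how the skill directory is located.

```python
from pathlib import Path

# Reference files named in the Reference Map above.
REQUIRED_REFERENCES = [
    "references/pre-flight.md",
    "references/pipeline.md",
    "references/stage-execution.md",
    "references/state-logging.md",
    "references/prepare-for-inference.md",
    "references/REPORT_RENDERING.md",
    "references/SCRIPT_USAGE.md",
]


def missing_references(skill_root: Path) -> list[str]:
    """Return the reference files that are absent under skill_root."""
    return [rel for rel in REQUIRED_REFERENCES if not (skill_root / rel).is_file()]
```

A non-empty result would be treated as a hard stop, with a request to reinstall the plugin rather than any improvised substitute.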

Data Contract

Inputs (all paths under <workspace> unless absolute):
text
<workspace>/
├── .env                                     # NGC_API_KEY (nvcr.io/* image pulls), HF_TOKEN (HuggingFace pre-flight pulls). No AnomalyGen credentials required — this EA variant ingests pre-generated pairs.
├── specs/baseline_spec.yaml                 # ChangeNet train/eval spec
├── train/base/
│   ├── training_set.csv                     # seed training rows; ChangeNet 14-column siamese schema
│   └── validation_set.csv                   # held-out rows; checked for leakage against every train CSV
├── kpi/
│   ├── images/                              # KPI test images (real data only — no generated images here)
│   └── testing_set.csv                      # labels live in the CSV
├── augmentation/
│   ├── mining_pool/
│   │   ├── mining_pool.csv                  # append-only production-line samples; paths relative to this dir
│   │   └── images/                          # source images referenced by mining_pool.csv (e.g. *_SolderLight.jpg)
│   └── anomalygen/                          # customer-supplied pre-generated synthetic pairs (this EA variant does not run AnomalyGen)
│       ├── reconstructed_image/             # NG images (will become ChangeNet input_path); flat dir of *.jpg or *.png
│       ├── original_image/                  # OK partner images, same stems as reconstructed_image/ (will become ChangeNet golden_path)
│       └── defect_spec.jsonl                # OPTIONAL — one entry per defect_type if defect-type accounting is wanted in deft_state.json
│                                            # Stems in reconstructed_image/ and original_image/ must match 1-to-1; extensions may differ.
└── results/run_<YYYYMMDD_HHMMSS>/           # created/resumed by this workflow (= ${RESULTS_DIR})
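The 1-to-1 stem requirement for reconstructed_image/ and original_image/ in the layout above can be checked with a few set operations. A sketch with a hypothetical function name; it works on file names so extensions are free to differ, and it also flags duplicate stems within one directory, which is an added interpretation of "1-to-1".

```python
from pathlib import Path


def check_pairing(ng_names, ok_names) -> dict:
    """Compare NG and OK file names by stem, ignoring extensions."""
    ng_stems = [Path(n).stem for n in ng_names]
    ok_stems = [Path(n).stem for n in ok_names]
    ng, ok = set(ng_stems), set(ok_stems)
    return {
        "ng_only": sorted(ng - ok),  # NG images without an OK partner
        "ok_only": sorted(ok - ng),  # OK images without an NG partner
        "duplicate_stems": len(ng_stems) != len(ng) or len(ok_stems) != len(ok),
    }


report = check_pairing(["a.jpg", "b.png"], ["a.png", "b.jpg"])
```

Any non-empty ng_only or ok_only list corresponds to the "unpaired pre-gen dirs" condition.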
ChangeNet CSV schema (VCN). Mandatory columns: input_path, golden_path, label, object_name (siamese change-detector: a row without golden_path is unusable). Preserve boardname, scores, and provenance fields when present. TAO builds the full image path as {images_dir}/{input_path}/{object_name}_{light}{image_ext}, so input_path is a directory, not a file.
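To make the schema concrete, here is a small sketch that checks the four mandatory columns and builds an image path from the template quoted above. The function names and the sample rows are invented for illustration; only the column names and the path template come from the text.

```python
import csv
import io

MANDATORY = ("input_path", "golden_path", "label", "object_name")


def missing_mandatory(header) -> list:
    """Mandatory columns absent from a CSV header."""
    return [c for c in MANDATORY if c not in header]


def unusable_rows(rows) -> list:
    """Indices of rows with an empty mandatory field (e.g. no golden_path)."""
    return [i for i, r in enumerate(rows)
            if any(not (r.get(c) or "").strip() for c in MANDATORY)]


def build_image_path(images_dir, input_path, object_name, light, image_ext) -> str:
    """Apply the template: input_path is a directory, the file name comes from object_name."""
    return f"{images_dir}/{input_path}/{object_name}_{light}{image_ext}"


# Made-up sample: the second row lacks golden_path and is therefore unusable.
SAMPLE = "input_path,golden_path,label,object_name\nboard1/ng,board1/ok,NG,C12\nboard2/ng,,NG,R7\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
```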

Output Layout

Relative to <workspace>:
text
results/run_<YYYYMMDD_HHMMSS>/               # = ${RESULTS_DIR}
├── deft_state.json                          # current resume snapshot (schema: references/deft_state.json)
├── loop_log.jsonl                           # append-only stage log; single source of truth
├── DEFT_Loop_Report.html                    # re-rendered after each completed iteration (and at loop end) by agents/reporter.md
├── best_model.json                          # inference handoff metadata (see references/prepare-for-inference.md)
├── best_model_inference_spec.yaml           # ready-to-run TAO inference spec built from training config
├── iter${ITER}_summary.md                   # ≤300-word per-iteration summary
├── synth_pool/                              # built ONCE at Pre-Flight step 10 via scripts/prestage_pregen.py
│   ├── manifest.json                        # paths + counts for the loop to reference
│   ├── images/synth_{ng,ok}/                # ChangeNet-staged pre-gen pairs (single copy, shared across iters)
│   ├── sdg_rows.csv                         # 14-col + provenance + filepath; the SDG half of source_pool
│   ├── source_pool.{csv,parquet}            # real (mining_pool) + sdg unified pool with provenance
│   ├── source_embeddings.parquet            # written only when --embed-with-siglip was passed to prestage_pregen.py
│   └── source_embed.log                     # data-services log for the source embedding (if run)
├── baseline/
│   ├── train/                               # TAO train output: model_epoch_<EEE>_step_<SSS>.pth × N, status.json, experiment.yaml, train.log
│   ├── inference/{best_val,latest}/         # per-checkpoint inference.csv + KPI plots from scripts/analyze_kpi.py
│   └── rca_results/<TS>/                    # kpi_gaps.parquet, threshold.txt, weak_samples_breakdown.txt
└── iter${ITER}/
    ├── routing_results/<TS>/                # mining_gaps.parquet, anomalygen_gaps.parquet, routing_summary.txt
    ├── anomalygen/                          # per-iter bookkeeping (just records the synth_pool/manifest.json path)
    │   └── ingest_summary.json              # per-iter audit: which synth_pool manifest was reused, counts at iter start
    ├── mining_filter/
    │   ├── mining_pool.csv                  # top-K-per-target k-NN survivors from synth_pool/source_pool (synth + real subject to same filter)
    │   ├── knn_summary.csv                  # candidate_count, kept_count, rejected_count, similarity_threshold=0.9
    │   ├── target_embeddings.parquet        # embeddings of weak-target images (per-iter — targets change each iter)
    │   └── mining_summary.txt               # per-label breakdown emitted by mining container
    ├── dataset/
    │   ├── train_combined_iter${ITER}.csv
    │   └── train_combined_iter${ITER}_provenance.csv  # source ∈ {base_train, previous_iter_train, mining_pool}
    ├── train/                               # TAO train output for iter${ITER}
    ├── inference/{best_val,latest}/
    └── rca_results/<TS>/                    # next iteration's RCA reads inference/{best_val|latest}/inference.csv
A previous combined CSV's rows already include every prior contribution: assemble iter N+1 from train_combined_iter${N}.csv plus the new mining_filter/mining_pool.csv, not from train/base/training_set.csv again.
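The assembly rule can be sketched on already-parsed rows as follows. The function name is hypothetical, and the de-duplication key is an assumption added for safety; the provenance labels reuse the values listed for the _provenance.csv file in the layout above.

```python
def assemble_next_train_rows(prev_combined, new_mined,
                             key=("input_path", "golden_path", "object_name")):
    """Rows for iter N+1: previous combined rows plus newly mined rows.

    Returns (rows, provenance), where provenance[i] labels the origin of rows[i].
    Duplicate rows (same key) are kept once; that rule is an assumption.
    """
    seen, rows, provenance = set(), [], []
    for source, batch in (("previous_iter_train", prev_combined),
                          ("mining_pool", new_mined)):
        for row in batch:
            k = tuple(row.get(c) for c in key)
            if k in seen:
                continue
            seen.add(k)
            rows.append(row)
            provenance.append(source)
    return rows, provenance
```

Note that the base training set never appears as an input here: its rows are already carried inside the previous combined CSV.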

Safety & Gating

  • One user gate. The Pre-Flight Summary in references/pre-flight.md is the only confirmation point. Stop and wait for explicit approval before any side-effecting step; autonomous after.
  • Path rule. Every stage writes absolute host paths under ${RESULTS_DIR}/iter${ITER}/; reject any config with output: /results/... or any path outside <workspace>. See Invariants in references/stage-execution.md.
  • Disk is canonical. Re-read the loop_log.jsonl tail + deft_state.json before every stage; append exactly one loop_log.jsonl entry per stage via scripts/log_stage.py (never echo/inline jq). See references/state-logging.md.
  • Hard stops, never auto-retried: missing/empty/unpaired pre-gen dirs, missing or zero-row mining_pool.csv, mid-run pre-gen mutation, train/val leakage (mid-iteration and post-assembly checks), silent-drop, AMP allocation mismatch, CSV validation failure, missing reference file.
  • No SDG bypass. Synthetic rows go through the same k-NN as real rows; the loop never launches an SDG/AnomalyGen container in this EA variant.
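The "no SDG bypass" rule means real and synthetic candidates pass through one and the same similarity filter. A toy sketch of top-K-per-target selection with a cosine cutoff, in plain Python: the function names are hypothetical, the embeddings are made up, and treating the cutoff as inclusive (>=) is an assumption. The actual mining runs on SigLIP embeddings in its own container.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mine_top_k(targets, candidates, top_k_per_target=5, min_similarity=0.9):
    """targets/candidates map id -> embedding; returns the sorted ids that survive.

    Real and synthetic candidates share one pool, so both face the same filter.
    """
    kept = set()
    for t_vec in targets.values():
        scored = [(cosine(t_vec, c_vec), cid) for cid, c_vec in candidates.items()]
        scored = [s for s in scored if s[0] >= min_similarity]  # cosine cutoff
        scored.sort(reverse=True)  # most similar first
        kept.update(cid for _, cid in scored[:top_k_per_target])
    return sorted(kept)


targets = {"weak1": [1.0, 0.0]}
candidates = {"real_a": [1.0, 0.05], "synth_b": [0.9, 0.1],
              "real_c": [0.0, 1.0], "synth_d": [1.0, 0.0]}
survivors = mine_top_k(targets, candidates, top_k_per_target=2)
```

With these toy vectors, real_c is orthogonal to the target and fails the cutoff regardless of K, while the synthetic candidates are ranked against the real ones on similarity alone.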