monitor-experiment
Monitor Experiment Results
Monitor: $ARGUMENTS
Workflow
Step 1: Check What's Running
```bash
ssh <server> "screen -ls"
```
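
The session names needed for the next step can be pulled out of the `screen -ls` listing. A minimal sketch, assuming the usual listing format where each session line is indented and starts with `<pid>.<name>`; `list_screen_sessions` is a hypothetical helper name, not part of the original workflow:

```bash
# Hypothetical helper: reads `screen -ls` output on stdin and prints one
# session name per line, with the leading "<pid>." prefix removed.
list_screen_sessions() {
  awk '/^[ \t]+[0-9]+\./ { sub(/^[0-9]+\./, "", $1); print $1 }'
}
```

It could be used as `ssh <server> "screen -ls" | list_screen_sessions`. Header and footer lines of the listing are skipped because they do not start with an indented process ID.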
Step 2: Collect Output from Each Screen
For each screen session, capture the last N lines:
```bash
ssh <server> "screen -S <name> -X hardcopy /tmp/screen_<name>.txt && tail -50 /tmp/screen_<name>.txt"
```
If hardcopy fails, check for log files or tee output.
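
To repeat the capture for several sessions, the remote command can be generated per session. This is only a sketch, assuming session names contain no spaces or shell metacharacters; `capture_cmd` and `$SERVER` are hypothetical names introduced for illustration:

```bash
# Hypothetical helper: prints the remote command that dumps one screen
# session to /tmp and shows its last N lines (N defaults to 50).
capture_cmd() {
  local name=$1 lines=${2:-50}
  printf 'screen -S %s -X hardcopy /tmp/screen_%s.txt && tail -n %s /tmp/screen_%s.txt' \
    "$name" "$name" "$lines" "$name"
}

# Example loop (assumes SERVER is set and the names come from Step 1):
# for name in exp_baseline exp_a; do
#   ssh "$SERVER" "$(capture_cmd "$name")"
# done
```

Note that `hardcopy` writes the currently displayed window; `hardcopy -h` also includes the scrollback buffer, which may matter when the last 50 lines are not all on screen.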
Step 3: Check for JSON Result Files
```bash
ssh <server> "ls -lt <results_dir>/*.json 2>/dev/null | head -20"
```
If JSON results exist, fetch and parse them:
```bash
ssh <server> "cat <results_dir>/<latest>.json"
```
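
Picking `<latest>` can be scripted rather than read off the listing. A minimal sketch, assuming it runs where the result files live (for example inside the `ssh` command) and that file names contain no newlines; `latest_json` is a hypothetical helper name:

```bash
# Hypothetical helper: prints the most recently modified *.json file in a
# directory, or nothing if the directory has no JSON files.
latest_json() {
  local dir=$1
  # ls -t sorts by modification time, newest first.
  ls -t "$dir"/*.json 2>/dev/null | head -n 1
}
```

If `python3` happens to be available on the server (an assumption), the chosen file could be pretty-printed with `python3 -m json.tool <file>` to make the fields easier to read before summarizing.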
Step 4: Summarize Results
Present results in a comparison table:
| Experiment | Metric | Delta vs Baseline | Status |
|-----------|--------|-------------------|--------|
| Baseline | X.XX | — | done |
| Method A | X.XX | +Y.Y | done |
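
A table in this shape can be generated from collected values. The sketch below assumes the results have already been reduced to tab-separated `name<TAB>metric<TAB>status` lines with the baseline first; that intermediate format and the name `results_table` are assumptions for illustration, not something the workflow defines:

```bash
# Hypothetical helper: reads "name<TAB>metric<TAB>status" lines on stdin.
# The first line is treated as the baseline; later rows get a signed delta.
results_table() {
  awk -F'\t' '
    BEGIN {
      print "| Experiment | Metric | Delta vs Baseline | Status |"
      print "|-----------|--------|-------------------|--------|"
    }
    NR == 1 { base = $2; printf "| %s | %.2f | - | %s |\n", $1, $2, $3; next }
            { printf "| %s | %.2f | %+.1f | %s |\n", $1, $2, $2 - base, $3 }
  '
}
```

Taking the status from the input keeps rows for experiments that are still running from being reported as done.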

Step 5: Interpret
- Compare against known baselines
- Flag unexpected results (negative delta, NaN, divergence)
- Suggest next steps based on findings
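
Two of the checks above, non-numeric metrics such as NaN and negative deltas, can be automated. This is a rough sketch under the same assumed tab-separated `name<TAB>metric<TAB>status` input with the baseline on the first line; it also assumes a higher metric is better, and it does not attempt to detect divergence. `flag_results` is a hypothetical name:

```bash
# Hypothetical helper: prints one FLAG line per suspicious row.
flag_results() {
  awk -F'\t' '
    # True only for plain decimal numbers, so "nan" and "inf" are rejected.
    function is_number(s) {
      return s ~ /^[-+]?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$/
    }
    NR == 1 {
      base = $2
      if (!is_number(base)) print "FLAG " $1 ": baseline is not a number (" base ")"
      next
    }
    !is_number($2) { print "FLAG " $1 ": metric is not a number (" $2 ")"; next }
    is_number(base) && $2 - base < 0 { print "FLAG " $1 ": negative delta vs baseline" }
  '
}
```

Flagged rows are candidates for a closer look at the training logs, not conclusions in themselves.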
Step 6: Feishu Notification (if configured)
After results are collected, check `~/.claude/feishu.json`:
- Send `experiment_done` notification: results summary table, delta vs baseline
- If config absent or mode `"off"`: skip entirely (no-op)
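
The skip condition can be expressed as a small guard. This is a sketch only: it assumes the config is JSON with a top-level `"mode"` string field, which the text implies but does not spell out, and `feishu_enabled` is a hypothetical helper name. Sending the notification itself is not shown because the source does not describe that interface.

```bash
# Hypothetical helper: succeeds (exit 0) only if the config file exists
# and its mode is not "off". Defaults to ~/.claude/feishu.json.
feishu_enabled() {
  local config=${1:-$HOME/.claude/feishu.json}
  [ -f "$config" ] || return 1   # config absent: skip
  if grep -Eq '"mode"[[:space:]]*:[[:space:]]*"off"' "$config"; then
    return 1                     # mode "off": skip
  fi
  return 0
}

# Example guard around the notification step:
# if feishu_enabled; then
#   : # send the experiment_done notification here
# fi
```

A pattern match is used instead of a JSON parser to avoid extra dependencies; a tool such as `jq`, where installed, would be more robust.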
Key Rules
- Always show raw numbers before interpretation
- Compare against the correct baseline (same config)
- Note if experiments are still running (check progress bars, iteration counts)
- If results look wrong, check training logs for errors before concluding