harness-step3-session-management
Harness Step 3: Establish Cross-Session State Management
Objectives
Create three files so that an agent can restore its working state within 30 seconds at the start of any new session:
- `init.sh`: environment initialization script that verifies the project starts normally
- `tasks.json`: the current task list, the agent's source of work instructions
- `progress.md`: human-readable progress summary that records the key information from each session

Core principle: state is passed through files, not through the agent's memory. `git log` is the primary record; these three files supplement it.
Implementation Steps
Step 1: Scan Project Startup Methods
Before writing `init.sh`, confirm how the project is started and tested:
```bash
# Read the scripts in package.json (Node.js projects)
grep -A 20 '"scripts"' package.json 2>/dev/null

# Or read the Makefile (multi-language projects)
head -40 Makefile 2>/dev/null

# Or read pyproject.toml (Python projects)
# -F treats the bracketed section name as a literal string, not a character class
grep -F -A 20 '[tool.poetry.scripts]' pyproject.toml 2>/dev/null

# Confirm the startup commands in the existing AGENTS.md
# -E enables alternation with |
grep -E -A 5 '启动命令|start|dev|run' AGENTS.md 2>/dev/null
```
Collect:
- Development server startup command
- Test command
- Type checking/lint command (if any)
- Any initialization steps that must run first (such as database migrations)
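
For Node.js projects, the same collection step can be sketched in Python, which avoids depending on how `package.json` happens to be formatted. This is a minimal sketch, not part of the original skill: the helper names and the list of script names in `wanted` are assumptions chosen for illustration.

```python
import json


def list_npm_scripts(package_json_text):
    """Return the "scripts" mapping from the text of a package.json file."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    # Keep only string-valued entries so the result is safe to print.
    return {name: cmd for name, cmd in scripts.items() if isinstance(cmd, str)}


def pick_commands(scripts, wanted=("dev", "start", "test", "lint", "typecheck")):
    """Select the commands init.sh is likely to need (names are assumptions)."""
    return {name: scripts[name] for name in wanted if name in scripts}


# Hypothetical package.json content, for illustration only.
example = '{"name": "demo", "scripts": {"dev": "vite", "test": "jest", "build": "vite build"}}'
print(pick_commands(list_npm_scripts(example)))
```

Anything not found this way (for example a database migration step) still has to be confirmed from the Makefile, AGENTS.md, or the user.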
---
Step 2: Create init.sh
The role of `init.sh`: run it at the start of every session to quickly verify that the environment is healthy. If it is not, fix the problem before continuing.
```bash
#!/bin/bash
# init.sh: run at the start of each session
# Verify that the development environment is in a working state

set -e  # Stop if any step fails

echo "=== Checking Environment ==="

# 1. Confirm we are in the correct directory
echo "Working Directory: $(pwd)"

# 2. Install dependencies (if node_modules does not exist)
# [Choose based on the tech stack; the following is an example]
# Node.js:
if [ ! -d "node_modules" ]; then
  echo "Installing dependencies..."
  npm install
fi

# 3. Smoke test: verify the project can start normally
# [Write this for the actual project; the goal is to verify basic functionality as quickly as possible]
# Example: run the fastest test
npm run test -- --testPathPattern=smoke 2>/dev/null || echo "Warning: smoke test failed, please fix it first"

echo "=== Environment check completed, ready to start working ==="
echo "Tip: run 'git log --oneline -10' to view recent work history"
```
**Writing Requirements**:
- Fill in the actual startup commands found during the scan; do not leave the example comments in place
- The smoke test must be fast (< 30 seconds); its purpose is to surface environment problems quickly, not to run the full test suite
- If the project has a database, add a step that checks the database connection
- After writing the script, run it once to confirm it produces no errors: `bash init.sh`
---
Step 3: Create tasks.json
Structure Design:
```json
{
  "project": "[Project Name]",
  "last_updated": "[Today's date, format YYYY-MM-DD]",
  "current_focus": "[The single most important thing right now, one sentence]",
  "tasks": [
    {
      "id": "[Module Abbreviation]-[Serial Number]",
      "title": "[Task Title]",
      "description": "[What to do specifically, 1-3 sentences]",
      "status": "pending | in_progress | done | blocked",
      "priority": "high | medium | low",
      "blocked_by": "[Blocking reason, fill in only when status is blocked]",
      "verify": "[How to verify this task is completed]",
      "requires_eval": false
    }
  ]
}
```
Field Explanation (every field must be filled in whenever a task is added; none may be omitted):
| Field | Required | Explanation |
|---|---|---|
| `id` | Required | Module abbreviation + serial number |
| `title` | Required | Task title, one sentence |
| `description` | Required | What to do specifically, 1-3 sentences |
| `status` | Required | Initial value is `pending` |
| `priority` | Required | `high` / `medium` / `low` |
| `blocked_by` | Only when blocked | Blocking reason |
| `verify` | Required | How to verify completion; must be executable steps (commands or operations) |
| `requires_eval` | Required | Whether independent Evaluator review is needed; default `false` |
Set `requires_eval` to `true` if any one of the following conditions is met:
- The task is a new feature (not just a bug fix or configuration change)
- It involves security, permissions, or data-validation logic
- It is expected to modify more than 3 files
- The task description mentions "refactoring" or "architecture adjustment"

Set `requires_eval` to `false` (review may be skipped) only when none of the conditions above applies and the task consists entirely of the following kinds of change:
- Pure bug fix with a clearly bounded scope
- Documentation updates, added comments
- Configuration adjustments, environment variable changes
- Added unit tests
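
The "any one condition is enough" rule above can be written down as a small predicate. The following is a sketch under stated assumptions: the `TaskTraits` structure and its field names are hypothetical and are not part of `tasks.json`, and the keyword match is a crude stand-in for reading the task description.

```python
from dataclasses import dataclass


@dataclass
class TaskTraits:
    """Hypothetical summary of a task, filled in by whoever adds the task."""
    is_new_feature: bool = False
    touches_security: bool = False  # security, permissions, data validation
    expected_files_changed: int = 0
    description: str = ""


# Assumed keywords; "refactor" also matches "refactoring".
REVIEW_KEYWORDS = ("refactor", "architecture adjustment")


def requires_eval(traits):
    """Return True if any one of the review conditions is met."""
    text = traits.description.lower()
    return (
        traits.is_new_feature
        or traits.touches_security
        or traits.expected_files_changed > 3
        or any(keyword in text for keyword in REVIEW_KEYWORDS)
    )


print(requires_eval(TaskTraits(description="Fix typo in README")))
print(requires_eval(TaskTraits(expected_files_changed=4)))
```

A predicate like this only supports the judgment; whoever adds the task is still expected to check each condition deliberately rather than accept a default.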
How to Determine the Initial Task List:
Extract tasks from the following sources, in order of preference:
- Plan files in `docs/exec-plans/active/` (if any)
- High-priority debt items in `docs/exec-plans/tech-debt-tracker.md`
- TODOs or roadmap items mentioned in the README
- Ask the user: "What are the 3-5 tasks you most want to advance right now?"
Writing Requirements:
- The `verify` field must contain executable steps; vague phrases such as "confirm functionality is normal" are not acceptable
- Task granularity: a task should be completable within 1-2 hours; split anything larger
- Initial status: every task starts as `pending` and is updated by the agent as it works

Ask the User (if tasks cannot be inferred from existing documents):
"I have scanned the project and am ready to create the task list. Please tell me: what are the 3-5 tasks you most want to advance right now? One sentence per task is enough."
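
The field rules for `tasks.json` are mechanical enough to check with a script. Below is a minimal sketch of such a check; the function name, the error messages, and the sample task (including its `AUTH-1` id) are assumptions for illustration, and the rule that a blocked task needs `blocked_by` is inferred from the field table.

```python
REQUIRED_FIELDS = ("id", "title", "description", "status", "priority", "verify", "requires_eval")
STATUSES = ("pending", "in_progress", "done", "blocked")
PRIORITIES = ("high", "medium", "low")


def validate_task(task):
    """Return a list of problems found in one task entry (empty if none)."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if task.get(name) in ("", None)
    ]
    if "status" in task and task["status"] not in STATUSES:
        problems.append(f"invalid status: {task['status']}")
    if "priority" in task and task["priority"] not in PRIORITIES:
        problems.append(f"invalid priority: {task['priority']}")
    if task.get("status") == "blocked" and not task.get("blocked_by"):
        problems.append("blocked task needs blocked_by")
    if "requires_eval" in task and not isinstance(task["requires_eval"], bool):
        problems.append("requires_eval must be true or false")
    return problems


# Hypothetical task entry, for illustration only.
good = {
    "id": "AUTH-1",
    "title": "Reject expired tokens",
    "description": "Return 401 when the token has expired.",
    "status": "pending",
    "priority": "high",
    "verify": "npm run test -- --testPathPattern=auth",
    "requires_eval": True,
}
print(validate_task(good))
```

This catches missing or malformed fields only. Whether a `verify` value is really an executable step still needs a human or agent to read it.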
Step 4: Create progress.md
Initial content of `progress.md`:
```markdown
# Project Progress Record

After completing a task in a session, add a record at the top. Do not delete history.
Format: ## [Date] [Task Name]

## [Today's Date] Initialize Harness
- Completed harness-step1: established the docs/ skeleton
- Completed harness-step2: filled in the knowledge base content
- Completed harness-step3: established state management
- Number of initial tasks in tasks.json: [N]
- Next session starts here: read tasks.json and pick a task with priority=high and status=pending
```
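
Adding a record "at the top" while keeping the file header and all history intact can be sketched as follows. This assumes that each record begins with a line starting with `## `, as the stated format suggests; the function name and the sample dates are hypothetical.

```python
def prepend_entry(progress_text, date, task_name, notes):
    """Insert a new '## date task' record above the newest existing record."""
    entry_lines = [f"## {date} {task_name}"] + [f"- {note}" for note in notes] + [""]
    lines = progress_text.splitlines()
    # The first line starting with "## " marks the newest existing record;
    # everything before it is the file header and stays on top.
    insert_at = next(
        (i for i, line in enumerate(lines) if line.startswith("## ")),
        len(lines),
    )
    updated = lines[:insert_at] + entry_lines + lines[insert_at:]
    return "\n".join(updated) + "\n"


# Hypothetical existing file content, for illustration only.
existing = (
    "# Project Progress Record\n\n"
    "Do not delete history.\n\n"
    "## 2024-01-01 Initialize Harness\n- Completed harness-step3\n"
)
print(prepend_entry(existing, "2024-01-02", "AUTH-1 Reject expired tokens", ["verify passed"]))
```

Because the function only inserts lines, earlier records are never rewritten, which matches the "do not delete history" rule.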
---
Step 4b: Update AGENTS.md (Write Task Management Rules)
Find the "After completing each task" section that step2 wrote into `AGENTS.md` and replace it with the following content:
```markdown
## When adding a task, you must:
- Fill in every field in tasks.json; none may be omitted
- Decide `requires_eval` against the following criteria; do not default to false without thinking:
  - New feature / involves security or permissions / changes more than 3 files / refactoring → true
  - Pure bug fix / documentation update / configuration adjustment → false

## After completing each task, you must do the following in order:
1. Run the verification steps described in the task's `verify` field in `tasks.json`
2. If the task's `requires_eval` is `true`: fill in `sprint_output.md` and wait for the Evaluator's review to pass before marking the task `done`.
   If the task's `requires_eval` is `false`: mark the task `done` once verification passes
3. git commit, format: `type(scope): what was done, what was left over (if any)`
4. Add this session's record at the top of `progress.md`

Prohibited: skipping the verify step and deciding on your own that the task is complete.
Prohibited: setting `requires_eval` to false without judgment.
```
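
The commit format `type(scope): ...` in the rules above can be checked mechanically. The sketch below is one possible check, not something the skill prescribes: the source does not list the allowed `type` values, so the pattern accepts any lowercase word, and the function name and sample messages are hypothetical.

```python
import re

# type(scope): summary. Allowed type values are not specified in the rules,
# so any lowercase word is accepted here (an assumption).
COMMIT_SUBJECT = re.compile(r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<summary>\S.*)$")


def parse_commit_subject(subject):
    """Return (type, scope, summary), or None if the format does not match."""
    match = COMMIT_SUBJECT.match(subject)
    if match is None:
        return None
    return match.group("type"), match.group("scope"), match.group("summary")


print(parse_commit_subject("fix(auth): reject expired tokens, rate limiting still pending"))
print(parse_commit_subject("updated stuff"))
```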
---
Step 5: Verify Overall Integration
After creating the three files, simulate a complete session startup to verify that they work together:
```bash
# Simulate the sequence of operations an agent performs when a new session starts
echo "=== Simulating New Session Startup ==="

# 1. Run init.sh
bash init.sh

# 2. Check git log
git log --oneline -10

# 3. Read progress.md (confirm the file exists and is readable)
head -20 progress.md

# 4. Read tasks.json (confirm the format is correct)
python3 -m json.tool tasks.json > /dev/null && echo "tasks.json format is correct" || echo "tasks.json format is incorrect"
```
This step is complete only when every check passes.
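
Once the startup sequence has read `tasks.json`, the agent still has to choose what to work on. Here is a minimal sketch of that selection. The rule given in `progress.md` is "priority=high and status=pending"; falling back to medium and then low priority when no high-priority task is pending is an assumption added here, as are the function name and the sample ids.

```python
import json

PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def next_task(tasks_json_text):
    """Return the first pending task with the highest priority, or None."""
    data = json.loads(tasks_json_text)
    pending = [task for task in data.get("tasks", []) if task.get("status") == "pending"]
    if not pending:
        return None
    # min() returns the first task with the best rank, so file order breaks ties.
    return min(
        pending,
        key=lambda task: PRIORITY_ORDER.get(task.get("priority"), len(PRIORITY_ORDER)),
    )


# Hypothetical tasks.json content, for illustration only.
sample = json.dumps({
    "tasks": [
        {"id": "A-1", "status": "done", "priority": "high"},
        {"id": "A-2", "status": "pending", "priority": "medium"},
        {"id": "A-3", "status": "pending", "priority": "high"},
        {"id": "A-4", "status": "pending", "priority": "high"},
    ]
})
print(next_task(sample)["id"])
```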
---
Quality Inspection
- Does `init.sh` actually run without errors?
- Is `tasks.json` valid JSON? Does every task have `verify` and `requires_eval` fields?
- Was each task's `requires_eval` filled in against the judgment criteria rather than defaulted to false?
- Does `progress.md` contain an initial record?
- Does `AGENTS.md` contain both rules, the one for adding tasks and the one for completing them?
Inform the User After Completion
Output summary:
Created files:
- `init.sh`: [describe which checks it performs]
- `tasks.json`: [N] tasks, [N] of which require Evaluator review
- `progress.md`: initialized

How to use:
You can now hand the project over to Claude Code. Each time it starts, it automatically reads these three files plus git log and restores its working state. You no longer need to explain "where we left off last time".

What you need to do:
- Check whether the task list in `tasks.json` matches your expectations; you can add or delete tasks manually
- Confirm that the `requires_eval` judgments are reasonable
- If any step in `init.sh` fails, tell me and I will fix it

Next steps:
- The Harness foundation is complete (step1 + step2 + step3)
- You can start real development with Claude Code
- If the agent repeatedly violates code conventions, run `harness-step4-linter` to turn the rules into mechanical constraints
- If the agent's self-assessment cannot be trusted, run `harness-step5-evaluator` to introduce independent review