soak-test
Soak Test
A soak test (also called an endurance test) is an extended play session run
with specific observation goals. Unlike a smoke check (broad critical path,
~10 min) or a single-feature playtest (~30 min), a soak test runs for 30
minutes to several hours to surface:
- Memory leaks — gradual heap growth that only appears after scene transitions
- Performance drift — frame time degradation that worsens over time
- State accumulation bugs — issues that only appear after N repetitions of a mechanic (inventory full, score overflow, AI state corruption)
- Fun fatigue — mechanics that feel good in a first session but grow repetitive over extended play
- Content exhaustion — the point where players run out of novel content
This skill generates the observation protocol and analysis harness — the
human does the actual playing.
Output: `production/qa/soak-test-[date]-[duration].md`
When to run:
- Polish phase, before `/gate-check release`
- After fixing a memory or stability issue (regression soak)
- When extended play has not been formally tracked
1. Parse Arguments
Duration (default: `1h`):
- `30m`: short soak; suitable for testing a single mechanic or scene
- `1h`: standard soak; covers most common leak categories
- `2h`: extended soak; recommended for first full Polish soak
- `4h`: deep soak; required for games with long session design (RPGs, sims)
Focus (default: `all`):
- `memory`: focus on heap size, object count, leak patterns
- `stability`: focus on crash/freeze/hang detection
- `balance`: focus on fun fatigue, content exhaustion, difficulty perception
- `all`: all of the above
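As a rough illustration of the argument handling described above, the following Python sketch resolves a duration and a focus with the documented defaults. The function name `parse_soak_args` and the assumption that arguments arrive as space-separated tokens in any order are hypothetical; the skill itself does not specify an argument syntax.

```python
# Allowed values, taken from the Duration and Focus lists.
VALID_DURATIONS = ("30m", "1h", "2h", "4h")
VALID_FOCUSES = ("memory", "stability", "balance", "all")


def parse_soak_args(raw: str = "") -> tuple[str, str]:
    """Return (duration, focus), falling back to the documented defaults.

    Assumption: arguments are space-separated tokens in any order.
    """
    duration, focus = "1h", "all"  # documented defaults
    for token in raw.lower().split():
        if token in VALID_DURATIONS:
            duration = token
        elif token in VALID_FOCUSES:
            focus = token
        else:
            raise ValueError(f"unrecognised soak argument: {token!r}")
    return duration, focus
```

For example, `parse_soak_args("2h memory")` would yield `("2h", "memory")`, and an empty argument string would yield the defaults.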
2. Load Context
Read:
- `.claude/docs/technical-preferences.md`: engine (for engine-specific memory monitoring guidance), performance budgets (memory ceiling, target FPS)
- `design/gdd/game-concept.md`: intended session length (for comparison against soak duration), core loop description
- Most recent file in `production/playtests/`: prior playtest findings (to avoid re-documenting known issues)
- Most recent file in `production/qa/qa-plan-*.md`: current sprint test coverage (to understand what has been formally tested vs. what the soak covers)
Note any performance budget targets from technical-preferences.md:
- Memory ceiling: [N MB, or "not set"]
- Target FPS: [N, or "not set"]
- Frame budget: [N ms, or "not set"]
3. Define Observation Checkpoints
Based on duration, generate timed checkpoints:
30m soak: T+0, T+10, T+20, T+30
1h soak: T+0, T+15, T+30, T+45, T+60
2h soak: T+0, T+20, T+40, T+60, T+80, T+100, T+120
4h soak: T+0, T+30, T+60, T+90, T+120, T+180, T+240
At each checkpoint, the observer records the observation items defined in
Phase 4.
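The checkpoint schedule above can be expressed as a small lookup, sketched here in Python. The names `CHECKPOINT_MINUTES` and `checkpoint_labels` are illustrative and not part of the skill.

```python
# Checkpoint minutes per duration, copied from the schedule above.
CHECKPOINT_MINUTES = {
    "30m": [0, 10, 20, 30],
    "1h": [0, 15, 30, 45, 60],
    "2h": [0, 20, 40, 60, 80, 100, 120],
    "4h": [0, 30, 60, 90, 120, 180, 240],
}


def checkpoint_labels(duration: str) -> list[str]:
    """Return the T+N labels for a supported soak duration."""
    try:
        minutes = CHECKPOINT_MINUTES[duration]
    except KeyError:
        raise ValueError(f"unsupported duration: {duration!r}") from None
    return [f"T+{m}" for m in minutes]
```

A table is used rather than a fixed interval because the 4h schedule is not evenly spaced: it steps by 30 minutes up to T+120 and by 60 minutes afterwards.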
4. Generate the Soak Test Protocol
Memory / Stability observation items (if focus = memory or all)
Engine-specific monitoring guidance:
Godot 4:
- Open Debugger → Monitors tab; track `Memory → Static Memory` and `Object Count → Objects` across checkpoints
- Record: Static Memory (KB), Object Count, Orphan Nodes count
- Alert threshold: Memory growth > 20% from T+0 after the first 15 minutes (some growth on load is expected; sustained growth indicates a leak)
- Note: `Performance.get_monitor(Performance.MEMORY_STATIC)` returns bytes in Godot 4.6
Unity:
- Open Memory Profiler (Window → Analysis → Memory Profiler)
- Record: Total Reserved Memory (MB), GC Allocated (MB), Object Count at each checkpoint
- Alert threshold: GC Allocated growing monotonically across 3+ checkpoints
Unreal Engine:
- Use console command `stat memory` at each checkpoint
- Record: Physical Memory Used (MB), Physical Memory Available
- Alert threshold: Physical Memory Used growth > 50MB over the full soak
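The three alert thresholds above can be checked mechanically once the checkpoint values have been recorded by hand. The following Python sketch is one possible reading of them. The function names are hypothetical, and two interpretations are assumptions: the Godot check evaluates checkpoints at or after T+15, and "growing monotonically across 3+ checkpoints" is taken to mean three or more consecutive checkpoints with strictly increasing values.

```python
def godot_alert(samples, grace_minutes=15, max_growth=0.20):
    """samples: (minutes, static_memory) pairs; the first pair is T+0.

    Assumption: checkpoints at or after grace_minutes are evaluated.
    """
    baseline = samples[0][1]
    limit = baseline * (1 + max_growth)
    return any(
        minutes >= grace_minutes and value > limit
        for minutes, value in samples
    )


def unity_alert(gc_allocated, run_length=3):
    """True if values rise strictly across run_length consecutive checkpoints."""
    streak = 1
    for prev, cur in zip(gc_allocated, gc_allocated[1:]):
        streak = streak + 1 if cur > prev else 1
        if streak >= run_length:
            return True
    return False


def unreal_alert(physical_used_mb, max_growth_mb=50.0):
    """True if Physical Memory Used grew by more than max_growth_mb overall."""
    return physical_used_mb[-1] - physical_used_mb[0] > max_growth_mb
```

These helpers only post-process numbers a human observer has written down; they do not run or monitor the game.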
Stability observation items (if focus = stability or all)
At each checkpoint, note:
- No crash, hang, or freeze occurred since last checkpoint
- Frame rate still within target budget ([target FPS] fps)
- Audio still playing correctly (no desync or silence)
- All HUD elements still rendering correctly
- Input responding as expected (no input loss or lag spike)
Balance / fatigue observation items (if focus = balance or all)
Collect subjective observations at each checkpoint:
- Core mechanic still feels rewarding (Y/N)
- Perceived difficulty level: [too easy / appropriate / too hard]
- Any "I've seen this before" moments since last checkpoint? (novel content exhaustion)
- Any moment of frustration since last checkpoint? Note cause.
- Any moment of peak engagement since last checkpoint? Note cause.
5. Generate the Protocol Document
Soak Test Protocol
Date: [date] Duration: [duration] Focus: [memory | stability | balance | all] Engine: [engine] Generated by: /soak-test
Pre-Session Setup
Before starting the soak:
- Game is running from a fresh launch (not resumed from a prior session)
- All background applications closed (minimise OS memory interference)
- Performance monitoring tool open and recording:
  - Godot: Debugger → Monitors tab → Memory section visible
  - Unity: Memory Profiler window open
  - Unreal: `stat memory` ready in console
- Soak target confirmed: [session design intent from game concept]
- Prior known issues to watch for: [from most recent playtest / qa-plan]
Baseline (T+0) — Record Before Playing
| Metric | Baseline Value |
|---|---|
| Memory / Heap | [record before first frame of gameplay] |
| Object Count | [record] |
| FPS (first 30 seconds) | [record] |
| [Engine-specific metric] | [record] |
Checkpoint Log
T+[N] minutes
Memory / Stability (if applicable):
| Metric | Value | Δ from Baseline | Alert? |
|---|---|---|---|
| Memory / Heap | | | |
| Object Count | | | |
| FPS | | | |
| Crashes / Hangs | | | |
Stability checks:
- No crash or hang since last checkpoint
- Frame rate within budget ([N] fps target)
- Audio correct
- HUD rendering correctly
- Input responding correctly
Balance / Fatigue (if applicable):
- Core mechanic still rewarding: Y / N
- Difficulty perception: too easy / appropriate / too hard
- Notable moments: [note any peak engagement or frustration]
- Content exhaustion signs: Y / N — [describe]
Free observations:
(Note anything unexpected observed since the last checkpoint)
[Repeat Checkpoint Log section for each timed checkpoint]
Post-Session Analysis
Memory Trend
| Checkpoint | Memory | Δ/hr extrapolated |
|---|---|---|
| T+0 | | |
| [T+N] | | |
Leak detected? Y / N
Estimated time to OOM at current rate: [N hours / not applicable]
Stability Summary
Total crashes: [N]
Total hangs: [N]
Worst FPS observed: [N] fps at [checkpoint]
Performance degradation: stable / mild / severe
Balance / Fatigue Summary
Fun curve: [engaged throughout / fatigue onset at T+N / repetitive from start]
Content exhaustion point: [never / at T+N / early]
Difficulty arc: [appropriate / too easy throughout / difficulty spike at T+N]
Issues Found
| ID | Severity | Checkpoint | Description |
|---|---|---|---|
| SOAK-001 | S[1-4] | T+[N] | [description] |
Verdict: PASS / PASS WITH CONCERNS / FAIL
PASS: No leaks detected, stability maintained, fun factor consistent
PASS WITH CONCERNS: Minor drift or fatigue noted; addressable in Polish
FAIL: Memory leak confirmed, stability breach, or severe fun fatigue
Sign-Off
- Tester: [name] — [date]
- QA Lead review: [name] — [date]
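The "Δ/hr extrapolated" column and the "Estimated time to OOM" field in the Memory Trend part of the template above can be filled in with simple linear extrapolation. The Python sketch below is one way to do that under stated assumptions: growth is measured from the T+0 baseline, the rate at the final checkpoint is used for the projection, and the memory ceiling is the budget noted from `technical-preferences.md`. The function name `memory_trend` is hypothetical.

```python
def memory_trend(samples, ceiling_mb=None):
    """samples: (minutes, memory_mb) pairs; the first pair is the baseline.

    Returns (rows, hours_to_ceiling), where each row is
    (minutes, memory_mb, growth_mb_per_hour). The rate is None at the baseline.
    """
    t0, m0 = samples[0]
    rows = []
    for minutes, mem in samples:
        elapsed_hours = (minutes - t0) / 60
        rate = (mem - m0) / elapsed_hours if elapsed_hours > 0 else None
        rows.append((minutes, mem, rate))

    final_rate = rows[-1][2]
    hours_to_ceiling = None  # "not applicable" when there is no growth or no ceiling
    if ceiling_mb is not None and final_rate is not None and final_rate > 0:
        remaining_mb = ceiling_mb - samples[-1][1]
        hours_to_ceiling = max(0.0, remaining_mb / final_rate)
    return rows, hours_to_ceiling
```

Real leaks are rarely perfectly linear, so the projected figure is best treated as an order-of-magnitude estimate rather than a prediction.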
6. Write Output
Present the protocol summary in conversation, then ask:
"May I write this soak test protocol to `production/qa/soak-test-[date]-[duration].md`?"
Write only after approval.
After writing:
"Protocol written. To run the soak:
- Open the file and follow the Pre-Session Setup checklist
- Record each checkpoint as you play
- Complete the Post-Session Analysis section when done
- File bugs from 'Issues Found' to `production/qa/bugs/`
- Run `/bug-triage sprint` after the session to integrate any S1/S2 issues
If the verdict is FAIL, run `/smoke-check` again after fixing the issues."
Collaborative Protocol
- This skill generates a protocol; humans run it. Never attempt to run a soak test automatically: the observations require a human observer.
- Duration should match the game's session design: a 5-minute game doesn't need a 4h soak; a city-builder might. Use judgment and ask if unclear.
- First soak should be `all` focus: narrow focus (memory-only) is for regression soaks after a specific fix, not the first pass.
- Ask before writing: always confirm before creating the protocol file.