why
Five Whys Analysis
Apply Five Whys root cause analysis to investigate issues by iteratively asking "why" to drill from symptoms to root causes.
Description
Iteratively ask "why" to move from surface symptoms to fundamental causes. Identifies systemic issues rather than quick fixes.
Usage
/why [issue_description]
Variables
- ISSUE: Problem or symptom to analyze (default: prompt for input)
- DEPTH: Number of "why" iterations (default: 5, adjust as needed)
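As a rough illustration of how the two variables and their defaults fit together, here is a minimal Python sketch. `WhyRequest` and `parse_why_command` are hypothetical names, not part of the command itself. Because the usage line shows no syntax for passing DEPTH, the sketch simply keeps the default of 5.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_DEPTH = 5  # default number of "why" iterations, per the Variables list


@dataclass
class WhyRequest:
    issue: Optional[str]        # None means "prompt the user for input"
    depth: int = DEFAULT_DEPTH  # adjust as needed


def parse_why_command(text: str) -> WhyRequest:
    """Parse '/why [issue_description]' into a WhyRequest (hypothetical helper)."""
    command, _, rest = text.strip().partition(" ")
    if command != "/why":
        raise ValueError(f"not a /why command: {text!r}")
    # An empty description falls back to None, i.e. prompt for input.
    issue = rest.strip() or None
    return WhyRequest(issue=issue)


with_issue = parse_why_command("/why Users see 500 error on checkout")
without_issue = parse_why_command("/why")
```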
Steps
- State the problem clearly
- Ask "Why did this happen?" and document the answer
- For that answer, ask "Why?" again
- Continue until reaching root cause (usually 5 iterations)
- Validate by working backwards: root cause → symptom
- Explore branches if multiple causes emerge
- Propose solutions addressing root causes, not symptoms
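The steps above can be sketched as a small data structure. This is an illustrative Python sketch only: `WhyChain` and its methods are hypothetical names, and the sample answers are taken from Example 1 in the Examples section. The `backwards` method supports the validation step by listing each link from root cause up to symptom, so every line can be read as "because X, therefore Y".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WhyChain:
    problem: str
    answers: List[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> None:
        """Record the answer to the next 'why' question."""
        self.answers.append(answer)

    @property
    def root_cause(self) -> str:
        """The deepest answer recorded so far."""
        if not self.answers:
            raise ValueError("no 'why' has been answered yet")
        return self.answers[-1]

    def backwards(self) -> List[str]:
        """Cause -> effect links, ordered from root cause up to the symptom."""
        chain = [self.problem] + self.answers
        links = [f"{cause} -> {effect}" for effect, cause in zip(chain, chain[1:])]
        return links[::-1]


chain = WhyChain("Users see 500 error on checkout")
for answer in [
    "Payment service throws exception",
    "Request timeout after 30 seconds",
    "Database query takes 45 seconds",
    "Missing index on transactions table",
    "Index creation wasn't in migration scripts",
]:
    chain.ask_why(answer)

for link in chain.backwards():
    print(link)
```

If any printed link does not read as a plausible cause and effect, that "why" probably needs another look or a separate branch.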
Examples
Example 1: Production Bug
Problem: Users see 500 error on checkout
Why 1: Payment service throws exception
Why 2: Request timeout after 30 seconds
Why 3: Database query takes 45 seconds
Why 4: Missing index on transactions table
Why 5: Index creation wasn't in migration scripts
Root Cause: Migration review process doesn't check query performance
Solution: Add query performance checks to migration PR template
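One way a query performance check like the one in Example 1 might look is sketched below, using Python's built-in `sqlite3` module with an in-memory database. The example does not say which database the real system uses, so SQLite, the table columns, the query, and the index name are all assumptions for illustration. The wording of `EXPLAIN QUERY PLAN` output varies between SQLite versions, so the check only looks for the word `INDEX`.

```python
import sqlite3


def uses_index(conn: sqlite3.Connection, query: str, params: tuple = ()) -> bool:
    """Return True if SQLite's query plan mentions an index for this query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
    # The last column of each plan row is a human-readable description.
    return any("INDEX" in row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema standing in for the transactions table from Example 1.
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
query = "SELECT * FROM transactions WHERE user_id = ?"

# Without an index on user_id, the plan is expected to be a full table scan.
before = uses_index(conn, query, (42,))

# The step that was missing from the migration scripts in Example 1.
conn.execute("CREATE INDEX idx_transactions_user_id ON transactions (user_id)")
after = uses_index(conn, query, (42,))

print(before, after)
```

A check along these lines could run in CI against the migrated schema, which addresses the process gap rather than only the single missing index.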
Example 2: CI/CD Pipeline Failures
Problem: E2E tests fail intermittently
Why 1: Race condition in async test setup
Why 2: Test doesn't wait for database seed completion
Why 3: Seed function doesn't return promise
Why 4: TypeScript didn't catch missing return type
Why 5: strict mode not enabled in test config
Root Cause: Inconsistent TypeScript config between src and tests
Solution: Unify TypeScript config, enable strict mode everywhere
Example 3: Multi-Branch Analysis
Problem: Feature deployment takes 2 hours
Branch A (Build):
Why 1: Docker build takes 90 minutes
Why 2: No layer caching
Why 3: Dependencies reinstalled every time
Why 4: Cache invalidated by timestamp in Dockerfile
Root Cause A: Dockerfile uses current timestamp for versioning
Branch B (Tests):
Why 1: Test suite takes 30 minutes
Why 2: Integration tests run sequentially
Why 3: Test runner config has maxWorkers: 1
Why 4: Previous developer disabled parallelism due to flaky tests
Root Cause B: Flaky tests masked by disabling parallelism
Solutions:
A) Remove timestamp from Dockerfile, use git SHA
B) Fix flaky tests, re-enable parallel test execution
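A multi-branch analysis like Example 3 can be recorded as one entry per branch. The Python sketch below is illustrative only: the `Branch` type and its field names are hypothetical, and the data is copied from the example. It also checks that the branches together account for the whole symptom (90 + 30 minutes = 2 hours), which is a quick way to spot a missing branch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Branch:
    name: str
    minutes: int      # time this branch contributes to the symptom
    whys: List[str]
    root_cause: str
    solution: str


problem = "Feature deployment takes 2 hours"
branches = [
    Branch(
        name="A (Build)",
        minutes=90,
        whys=[
            "Docker build takes 90 minutes",
            "No layer caching",
            "Dependencies reinstalled every time",
            "Cache invalidated by timestamp in Dockerfile",
        ],
        root_cause="Dockerfile uses current timestamp for versioning",
        solution="Remove timestamp from Dockerfile, use git SHA",
    ),
    Branch(
        name="B (Tests)",
        minutes=30,
        whys=[
            "Test suite takes 30 minutes",
            "Integration tests run sequentially",
            "Test runner config has maxWorkers: 1",
            "Previous developer disabled parallelism due to flaky tests",
        ],
        root_cause="Flaky tests masked by disabling parallelism",
        solution="Fix flaky tests, re-enable parallel test execution",
    ),
]

# Do the branches explain the full two hours?
total_minutes = sum(b.minutes for b in branches)

# Largest contributor first, to help prioritize the fixes.
for b in sorted(branches, key=lambda b: b.minutes, reverse=True):
    share = b.minutes / total_minutes
    print(f"Branch {b.name}: {b.minutes} min ({share:.0%}) -> {b.root_cause}")
```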
Notes
- Don't stop at symptoms; keep digging for systemic issues
- Multiple root causes are common; explore each branch separately
- Document every "why" for future reference
- Consider both technical and process-related causes
- The magic isn't in exactly 5 whys; stop when you reach the true root cause
- Stop when you hit systemic/process issues, not just technical details
- If "human error" appears, keep digging: why was the error possible?
- Root causes usually involve missing validation, missing docs, unclear process, or missing automation
- Test solutions: implement → verify symptom resolved → monitor for recurrence