error-recoverer

Error Recoverer

Detects, classifies, and recovers from errors during autonomous coding sessions.

Quick Start

Handle Error

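python
from scripts.error_recoverer import ErrorRecoverer

# project_dir, error, and context are supplied by the caller.
# handle_error is awaited, so this must run inside an async function.
recoverer = ErrorRecoverer(project_dir)
result = await recoverer.handle_error(error, context)

if result.recovered:
    print(f"Recovered via: {result.strategy}")
else:
    print(f"Failed: {result.reason}")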
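
The Quick Start reads three attributes from the returned object: recovered, strategy, and reason. A minimal sketch of what such a result might look like, assuming a plain dataclass; the class name `RecoveryResult`, the helper `describe`, and the field types are hypothetical and not confirmed by the source:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryResult:
    """Hypothetical result shape; only the three field names come from the Quick Start."""
    recovered: bool
    strategy: Optional[str] = None  # expected to be set when recovery succeeded
    reason: Optional[str] = None    # expected to be set when recovery failed


def describe(result: RecoveryResult) -> str:
    """Format a result the same way the Quick Start prints it."""
    if result.recovered:
        return f"Recovered via: {result.strategy}"
    return f"Failed: {result.reason}"
```

Keeping `strategy` and `reason` optional lets a single type cover both outcomes, which matches how the Quick Start branches on `recovered`.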

Automatic Recovery

python
@recoverer.with_recovery
async def risky_operation():
    # Operation that might fail
    pass
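
One way a decorator like `with_recovery` could work is to wrap the coroutine, catch any exception, and hand it to `handle_error`. The sketch below is an assumption about the behavior, not the actual code in `scripts/error_recoverer.py`: `SketchRecoverer`, the stubbed `handle_error`, and the single retry after a successful recovery are all hypothetical.

```python
import asyncio
import functools
from types import SimpleNamespace


class SketchRecoverer:
    """Hypothetical stand-in for ErrorRecoverer, for illustration only."""

    async def handle_error(self, error, context):
        # Stub: treat timeouts as recoverable, everything else as not.
        if isinstance(error, TimeoutError):
            return SimpleNamespace(recovered=True, strategy="retry", reason=None)
        return SimpleNamespace(recovered=False, strategy=None, reason=str(error))

    def with_recovery(self, func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            try:
                return await func(*args, **kwargs)
            except Exception as error:
                context = {"function": func.__name__}
                result = await self.handle_error(error, context)
                if not result.recovered:
                    raise  # recovery failed: surface the original error
                # Assumption: retry the operation once after a successful recovery.
                return await func(*args, **kwargs)
        return wrapper


recoverer = SketchRecoverer()
attempts = {"count": 0}


@recoverer.with_recovery
async def risky_operation():
    # Fails on the first call, succeeds on the second.
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise TimeoutError("network timeout")
    return "ok"
```

Calling `asyncio.run(risky_operation())` in this sketch fails once, goes through the stubbed recovery, and then returns the result of the second attempt.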

Error Recovery Workflow

┌─────────────────────────────────────────────────────────────┐
│                    ERROR RECOVERY FLOW                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. DETECT                                                  │
│     ├─ Catch exception                                      │
│     ├─ Parse error message                                  │
│     └─ Extract error context                                │
│                                                             │
│  2. CLASSIFY                                                │
│     ├─ Determine error category                             │
│     ├─ Assess severity level                                │
│     └─ Check if recoverable                                 │
│                                                             │
│  3. STRATEGIZE                                              │
│     ├─ Query causal memory for similar errors               │
│     ├─ Select recovery strategy                             │
│     └─ Prepare recovery action                              │
│                                                             │
│  4. RECOVER                                                 │
│     ├─ Execute recovery strategy                            │
│     ├─ Verify recovery success                              │
│     └─ Store error→solution chain                           │
│                                                             │
│  5. ESCALATE (if recovery fails)                            │
│     ├─ Rollback to checkpoint                               │
│     ├─ Create detailed error report                         │
│     └─ Signal for human intervention                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘
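
The five stages in the flow above can be read as a simple pipeline. The following sketch only illustrates that control flow: every function name, the keyword-based category guess, and the in-memory `known_fixes` dict standing in for causal memory are assumptions, and rollback and reporting are reduced to a returned dictionary.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for causal memory: error category -> recovery action.
known_fixes: Dict[str, Callable[[], bool]] = {
    "transient": lambda: True,  # pretend a retry succeeded
}


def guess_category(error: Exception) -> str:
    """Step 2: map an exception to a category (keyword matching is an assumption)."""
    message = str(error).lower()
    if "timeout" in message or "rate limit" in message:
        return "transient"
    if "disk full" in message or isinstance(error, MemoryError):
        return "unrecoverable"
    return "unknown"


def run_recovery_flow(error: Exception) -> dict:
    # 1. DETECT: capture the exception type and message as context.
    context = {"type": type(error).__name__, "message": str(error)}
    # 2. CLASSIFY: decide which category the error belongs to.
    category = guess_category(error)
    # 3. STRATEGIZE: look up a previously successful action, if any.
    action: Optional[Callable[[], bool]] = known_fixes.get(category)
    # 4. RECOVER: execute the action and check that it reported success.
    if action is not None and action():
        return {"recovered": True, "category": category, "context": context}
    # 5. ESCALATE: a real system would roll back and write a report here.
    return {"recovered": False, "category": category, "context": context,
            "escalated": True}
```

In this sketch an unknown category has no stored action, so it falls through to escalation instead of being retried blindly.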

Error Categories

| Category | Examples | Recovery Strategy |
|---|---|---|
| Transient | Network timeout, rate limit | Retry with backoff |
| Resource | File not found, permission denied | Fix path/permissions |
| Syntax | Parse error, invalid JSON | Fix syntax errors |
| Logic | Test failure, assertion error | Debug and fix code |
| Environment | Missing dependency, version mismatch | Install/update deps |
| Unrecoverable | Disk full, OOM | Escalate immediately |
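
The categories in the table above could be assigned with a small rule table. This is a sketch under stated assumptions, not the logic in `scripts/error_classifier.py`: the `ErrorCategory` enum, the mapping from built-in exception types, the message patterns, and the choice to escalate unmatched errors are all hypothetical.

```python
import json
import re
from enum import Enum


class ErrorCategory(Enum):
    TRANSIENT = "transient"
    RESOURCE = "resource"
    SYNTAX = "syntax"
    LOGIC = "logic"
    ENVIRONMENT = "environment"
    UNRECOVERABLE = "unrecoverable"


# Exception types are checked first, in order.
TYPE_RULES = [
    (MemoryError, ErrorCategory.UNRECOVERABLE),
    (TimeoutError, ErrorCategory.TRANSIENT),
    ((FileNotFoundError, PermissionError), ErrorCategory.RESOURCE),
    ((SyntaxError, json.JSONDecodeError), ErrorCategory.SYNTAX),
    (AssertionError, ErrorCategory.LOGIC),
    (ImportError, ErrorCategory.ENVIRONMENT),
]

# Fallback: message patterns chosen to match the table's examples.
MESSAGE_RULES = [
    (re.compile(r"disk full|no space left", re.I), ErrorCategory.UNRECOVERABLE),
    (re.compile(r"timed? ?out|rate limit", re.I), ErrorCategory.TRANSIENT),
    (re.compile(r"version mismatch", re.I), ErrorCategory.ENVIRONMENT),
]


def classify(error: BaseException) -> ErrorCategory:
    """Return the first matching category, by exception type and then by message."""
    for exc_types, category in TYPE_RULES:
        if isinstance(error, exc_types):
            return category
    for pattern, category in MESSAGE_RULES:
        if pattern.search(str(error)):
            return category
    # Assumption: anything unmatched is escalated rather than guessed at.
    return ErrorCategory.UNRECOVERABLE
```

Checking types before messages keeps the common cases cheap and leaves string matching for errors that arrive as generic exceptions, such as output captured from a subprocess.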

Recovery Strategies

python
from enum import Enum

class RecoveryStrategy(Enum):
    RETRY = "retry"              # Simple retry
    RETRY_BACKOFF = "backoff"    # Exponential backoff
    ROLLBACK = "rollback"        # Restore checkpoint
    FIX_AND_RETRY = "fix_retry"  # Apply fix, then retry
    SKIP = "skip"                # Skip and continue
    ESCALATE = "escalate"        # Human intervention
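
To make RETRY_BACKOFF concrete, here is a minimal sketch of retrying an async operation with exponential backoff. The function name `retry_with_backoff`, its parameters, and the default delays are assumptions for illustration and are not taken from `scripts/retry_handler.py`.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: tuple = (TimeoutError, ConnectionError),
) -> T:
    """Retry `operation`, doubling the delay after each failed attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller escalate
            # Delays grow as base_delay * 1, 2, 4, ... capped at max_delay.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            await asyncio.sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

Only the exception types listed in `retry_on` are retried, so errors outside the transient category propagate immediately. Production retry loops often add random jitter to the delay as well; that is left out here to keep the sketch short.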

Integration Points

  • memory-manager: Query/store causal chains
  • checkpoint-manager: Rollback on failure
  • coding-agent: Provide fixes for code errors
  • progress-tracker: Log error metrics

References

  • references/ERROR-CATEGORIES.md
    - Error classification
  • references/RECOVERY-STRATEGIES.md
    - Strategy details

Scripts

  • scripts/error_recoverer.py
    - Core recovery logic
  • scripts/error_classifier.py
    - Error classification
  • scripts/retry_handler.py
    - Retry with backoff
  • scripts/recovery_strategies.py
    - Strategy implementations