backtest-expert
Backtest Expert
Systematic approach to backtesting trading strategies based on professional methodology that prioritizes robustness over optimistic results.
Core Philosophy
Goal: Find strategies that "break the least", not strategies that "profit the most" on paper.
Principle: Add friction, stress test assumptions, and see what survives. If a strategy holds up under pessimistic conditions, it's more likely to work in live trading.
When to Use This Skill
Use this skill when:
- Developing or validating systematic trading strategies
- Evaluating whether a trading idea is robust enough for live implementation
- Troubleshooting why a backtest might be misleading
- Learning proper backtesting methodology
- Avoiding common pitfalls (curve-fitting, look-ahead bias, survivorship bias)
- Assessing parameter sensitivity and regime dependence
- Setting realistic expectations for slippage and execution costs
Backtesting Workflow
1. State the Hypothesis
Define the edge in one sentence.
Example: "Stocks that gap up >3% on earnings and pull back to the previous day's close within the first hour provide a mean-reversion opportunity."
If you can't articulate the edge clearly, don't proceed to testing.
2. Codify Rules with Zero Discretion
Define with complete specificity:
- Entry: Exact conditions, timing, price type
- Exit: Stop loss, profit target, time-based exit
- Position sizing: Fixed dollar amount, % of portfolio, or volatility-adjusted
- Filters: Market cap, volume, sector, volatility conditions
- Universe: What instruments are eligible
Critical: No subjective judgment allowed. Every decision must be rule-based and unambiguous.
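One way to enforce zero discretion is to capture every rule as an explicit, immutable value before any test runs. The sketch below is illustrative only: the class name `StrategyRules`, its field names, and the `validate` helper are hypothetical and not part of this skill.

```python
from dataclasses import dataclass
from enum import Enum


class PriceType(Enum):
    MARKET = "market"
    LIMIT = "limit"


@dataclass(frozen=True)
class StrategyRules:
    """Hypothetical rule set: every decision is a number or an enum."""
    # Entry
    min_gap_pct: float            # e.g. 3.0 for a >3% gap up
    entry_window_minutes: int     # minutes after the open in which entry is allowed
    entry_price_type: PriceType
    # Exit
    stop_loss_pct: float
    profit_target_pct: float
    max_holding_minutes: int      # time-based exit
    # Position sizing
    portfolio_pct_per_trade: float
    # Filters / universe
    min_market_cap_usd: float
    min_avg_daily_volume: int

    def validate(self) -> None:
        """Raise ValueError if any numeric rule is zero or negative."""
        bad = [name for name, value in vars(self).items()
               if isinstance(value, (int, float)) and value <= 0]
        if bad:
            raise ValueError(f"Non-positive rule values: {bad}")
```

Because the dataclass is frozen, the rules cannot be adjusted mid-backtest, which keeps after-the-fact tweaking visible as a new rule set rather than a silent edit.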
3. Run Initial Backtest
Test over:
- Minimum 5 years (preferably 10+)
- Multiple market regimes (bull, bear, high/low volatility)
- Realistic costs: Commissions + conservative slippage
Examine initial results for basic viability. If fundamentally broken, iterate on hypothesis.
4. Stress Test the Strategy
This is where 80% of testing time should be spent.
Parameter sensitivity:
- Test stop loss at 50%, 75%, 100%, 125%, 150% of baseline
- Test profit target at 80%, 90%, 100%, 110%, 120% of baseline
- Vary entry/exit timing by ±15-30 minutes
- Look for "plateaus" of stable performance, not narrow spikes
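The parameter sweep above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: `run_backtest` stands in for whatever backtest function you already have, and the `max_drop=0.5` plateau threshold is an assumed value, not one given by this skill.

```python
from __future__ import annotations

from typing import Callable, Sequence


def sweep(baseline: float, multipliers: Sequence[float],
          run_backtest: Callable[[float], float]) -> dict[float, float]:
    """Run the backtest once per scaled parameter value; returns {value: metric}."""
    return {round(baseline * m, 10): run_backtest(baseline * m) for m in multipliers}


def is_plateau(results: dict[float, float], min_metric: float = 0.0,
               max_drop: float = 0.5) -> bool:
    """True if every tested value beats min_metric and the worst result
    is within max_drop (as a fraction of the best) of the best result."""
    values = list(results.values())
    best, worst = max(values), min(values)
    if worst <= min_metric or best <= 0:
        return False
    return (best - worst) / best <= max_drop


# Stop-loss multipliers from the list above: 50% to 150% of baseline.
STOP_LOSS_MULTIPLIERS = (0.5, 0.75, 1.0, 1.25, 1.5)
```

A sweep whose metric stays positive and roughly flat across all five values passes `is_plateau`; one that is only profitable at the baseline fails it.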
Execution friction:
- Increase slippage to 1.5-2x typical estimates
- Model worst-case fills (buy at ask+1 tick, sell at bid-1 tick)
- Add realistic order rejection scenarios
- Test with pessimistic commission structures
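The worst-case fill rule (buy at ask + 1 tick, sell at bid - 1 tick) could be modeled roughly as below. The function names and the flat per-share commission are assumptions made for illustration; real commission structures vary by broker.

```python
def worst_case_fill(side: str, bid: float, ask: float, tick: float) -> float:
    """Pessimistic fill price: one tick worse than the far side of the quote."""
    if side == "buy":
        return ask + tick
    if side == "sell":
        return bid - tick
    raise ValueError(f"Unknown side: {side!r}")


def long_round_trip_pnl(entry_bid: float, entry_ask: float,
                        exit_bid: float, exit_ask: float,
                        tick: float, shares: int,
                        commission_per_share: float) -> float:
    """P&L of a long trade with worst-case fills on both legs.

    Commission is charged per share on entry and again on exit.
    """
    buy_price = worst_case_fill("buy", entry_bid, entry_ask, tick)
    sell_price = worst_case_fill("sell", exit_bid, exit_ask, tick)
    gross = (sell_price - buy_price) * shares
    return gross - 2 * commission_per_share * shares
```

Comparing this figure with the mid-price P&L of the same trade shows how much of the edge is consumed by friction alone.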
Time robustness:
- Analyze year-by-year performance
- Require positive expectancy in majority of years
- Ensure strategy doesn't rely on 1-2 exceptional periods
- Test in different market regimes separately
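A year-by-year check along these lines can be sketched in plain Python. The trade format (a list of `(exit_date, pnl)` tuples) and the `top2_share` measure of reliance on exceptional years are assumptions for illustration.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import date


def expectancy_by_year(trades: list[tuple[date, float]]) -> dict[int, float]:
    """Average P&L per trade, grouped by the year the trade was closed."""
    by_year: dict[int, list[float]] = defaultdict(list)
    for exit_date, pnl in trades:
        by_year[exit_date.year].append(pnl)
    return {year: sum(p) / len(p) for year, p in sorted(by_year.items())}


def robustness_report(trades: list[tuple[date, float]]) -> dict:
    """Summarize how evenly the profit is spread across years."""
    per_year = expectancy_by_year(trades)
    positive_years = sum(1 for e in per_year.values() if e > 0)
    totals: dict[int, float] = defaultdict(float)
    for exit_date, pnl in trades:
        totals[exit_date.year] += pnl
    total = sum(totals.values())
    top2 = sum(sorted(totals.values(), reverse=True)[:2])
    return {
        "years": len(per_year),
        "positive_years": positive_years,
        "majority_positive": positive_years > len(per_year) / 2,
        # Share of total profit from the two best years; None if not profitable.
        "top2_share": top2 / total if total > 0 else None,
    }
```

A `top2_share` close to or above 1.0 means the remaining years contributed little or lost money, which is the reliance on exceptional periods the list warns about.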
Sample size:
- Absolute minimum: 30 trades
- Preferred: 100+ trades
- High confidence: 200+ trades
5. Out-of-Sample Validation
Walk-forward analysis:
- Optimize on training period (e.g., Year 1-3)
- Test on validation period (Year 4)
- Roll forward and repeat
- Compare in-sample vs out-of-sample performance
Warning signs:
- Out-of-sample performance below 50% of in-sample performance
- Need frequent parameter re-optimization
- Parameters change dramatically between periods
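The rolling train/validate split and the 50% warning sign can be sketched as below. The helper names are hypothetical, and the choice of performance metric (anything positive in-sample, such as average return per trade) is left to the reader.

```python
from __future__ import annotations


def walk_forward_windows(years: list[int], train_size: int = 3,
                         test_size: int = 1) -> list[tuple[list[int], list[int]]]:
    """Split an ordered list of years into rolling (train, test) windows."""
    windows = []
    start = 0
    while start + train_size + test_size <= len(years):
        train = years[start:start + train_size]
        test = years[start + train_size:start + train_size + test_size]
        windows.append((train, test))
        start += test_size  # roll forward by one test period
    return windows


def oos_warning(in_sample: float, out_of_sample: float,
                threshold: float = 0.5) -> bool:
    """True if out-of-sample performance is below `threshold` of in-sample."""
    if in_sample <= 0:
        raise ValueError("In-sample metric must be positive to form a ratio")
    return out_of_sample / in_sample < threshold
```

For example, `walk_forward_windows([2015, 2016, 2017, 2018, 2019, 2020])` yields three windows, the first training on 2015-2017 and testing on 2018.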
6. Evaluate Results
Questions to answer:
- Does edge survive pessimistic assumptions?
- Is performance stable across parameter variations?
- Does strategy work in multiple market regimes?
- Is sample size sufficient for statistical confidence?
- Are results realistic, not "too good to be true"?
Decision criteria:
- ✅ Deploy: Survives all stress tests with acceptable performance
- 🔄 Refine: Core logic sound but needs parameter adjustment
- ❌ Abandon: Fails stress tests or relies on fragile assumptions
Key Testing Principles
Punish the Strategy
Add friction everywhere:
- Commissions higher than reality
- Slippage 1.5-2x typical
- Worst-case fills
- Order rejections
- Partial fills
Rationale: Strategies that survive pessimistic assumptions often outperform in live trading.
Seek Plateaus, Not Peaks
Look for parameter ranges where performance is stable, not optimal values that create performance spikes.
Good: Strategy profitable with stop loss anywhere from 1.5% to 3.0%
Bad: Strategy only works with stop loss at exactly 2.13%
Stable performance indicates genuine edge; narrow optima suggest curve-fitting.
Test All Cases, Not Cherry-Picked Examples
Wrong approach: Study hand-picked "market leaders" that worked
Right approach: Test every stock that met criteria, including those that failed
Selective examples create survivorship bias and overestimate strategy quality.
Separate Idea Generation from Validation
Intuition: Useful for generating hypotheses
Validation: Must be purely data-driven
Never let attachment to an idea influence interpretation of test results.
Common Failure Patterns
Recognize these patterns early to save time:
- Parameter sensitivity: Only works with exact parameter values
- Regime-specific: Great in some years, terrible in others
- Slippage sensitivity: Unprofitable when realistic costs added
- Small sample: Too few trades for statistical confidence
- Look-ahead bias: "Too good to be true" results
- Over-optimization: Many parameters, poor out-of-sample results
See references/failed_tests.md for detailed examples and diagnostic framework.
Available Reference Documentation
Methodology Reference
File: references/methodology.md
When to read: For detailed guidance on specific testing techniques.
Contents:
- Stress testing methods
- Parameter sensitivity analysis
- Slippage and friction modeling
- Sample size requirements
- Market regime classification
- Common biases and pitfalls (survivorship, look-ahead, curve-fitting, etc.)
Failed Tests Reference
File: references/failed_tests.md
When to read: When strategy fails tests, or learning from past mistakes.
Contents:
- Why failures are valuable
- Common failure patterns with examples
- Case study documentation framework
- Red flags checklist for evaluating backtests
Critical Reminders
Time allocation: Spend 20% generating ideas, 80% trying to break them.
Context-free requirement: If strategy requires "perfect context" to work, it's not robust enough for systematic trading.
Red flag: If backtest results look too good (>90% win rate, minimal drawdowns, perfect timing), audit carefully for look-ahead bias or data issues.
Tool limitations: Understand your backtesting platform's quirks (interpolation methods, handling of low liquidity, data alignment issues).
Statistical significance: Small edges require large sample sizes to prove. A 5% edge per trade needs 100+ trades to be distinguished from luck.
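To see where a figure like 100+ trades can come from, the sketch below uses the standard approximation that the mean of n trades has standard error stdev / sqrt(n), so separating an edge from zero at roughly z standard errors needs n >= (z * stdev / edge)^2. The per-trade standard deviation of 25% and z = 2 are assumptions chosen for illustration; the skill itself does not state them.

```python
import math
import statistics


def min_trades(edge: float, stdev: float, z: float = 2.0) -> int:
    """Trades needed for the mean edge to sit z standard errors above zero."""
    if edge <= 0 or stdev <= 0:
        raise ValueError("edge and stdev must be positive")
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round((z * stdev / edge) ** 2, 9))


def t_statistic(returns: list[float]) -> float:
    """Mean per-trade return divided by its standard error."""
    n = len(returns)
    return statistics.mean(returns) / (statistics.stdev(returns) / math.sqrt(n))


print(min_trades(edge=0.05, stdev=0.25))  # 100
print(min_trades(edge=0.01, stdev=0.25))  # 2500
```

Under these assumptions a 5% edge needs about 100 trades, while a 1% edge with the same per-trade dispersion needs about 2,500, which is why small edges are so expensive to prove.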
Discretionary vs Systematic Differences
This skill focuses on systematic/quantitative backtesting where:
- All rules are codified in advance
- No discretion or "feel" in execution
- Testing happens on all historical examples, not cherry-picked cases
- Context (news, macro) is deliberately stripped out
Discretionary traders study setups differently, so this skill may not apply to setups requiring subjective judgment.