cointegration-analysis
Cointegration Analysis
Cointegration testing identifies pairs of assets that share a long-run equilibrium
relationship, enabling statistical arbitrage and pairs trading strategies.
What Is Cointegration?
Two price series are cointegrated when they are individually non-stationary
(random walks) but a linear combination of them is stationary (mean-reverting).
Intuitively, the prices may wander apart temporarily but are pulled back to an
equilibrium spread over time.
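The definition can be made concrete with a small simulation. The sketch below is illustrative only: the parameters (α = 2, β = 1.5, an AR(1) deviation with coefficient 0.8) and the helper names `ar1_coefficient` and `simulate_cointegrated_pair` are assumptions chosen for the example, not values from any real asset pair. An AR(1) coefficient near 1 suggests a random walk, while a value clearly below 1 suggests mean reversion.

```python
import numpy as np

def ar1_coefficient(series):
    """OLS slope of series[t] on series[t-1]; values near 1 suggest a random walk."""
    lagged, current = series[:-1], series[1:]
    lagged_c = lagged - lagged.mean()
    return float(np.dot(lagged_c, current - current.mean()) / np.dot(lagged_c, lagged_c))

def simulate_cointegrated_pair(n=1000, alpha=2.0, beta=1.5, phi=0.8, seed=0):
    """Return (x, y) where x is a random walk and y = alpha + beta * x + AR(1) noise."""
    rng = np.random.default_rng(seed)
    x = 100.0 + np.cumsum(rng.normal(0.0, 1.0, n))  # non-stationary random walk
    shocks = rng.normal(0.0, 0.5, n)
    noise = np.zeros(n)
    for t in range(1, n):                            # stationary, mean-reverting deviation
        noise[t] = phi * noise[t - 1] + shocks[t]
    y = alpha + beta * x + noise
    return x, y

x, y = simulate_cointegrated_pair()
spread = y - 1.5 * x  # linear combination using the true beta
print("AR(1) coefficient of x:     ", round(ar1_coefficient(x), 3))
print("AR(1) coefficient of y:     ", round(ar1_coefficient(y), 3))
print("AR(1) coefficient of spread:", round(ar1_coefficient(spread), 3))
```

Both price series should show a coefficient close to 1, while the spread's coefficient should sit near the simulated 0.8.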
Cointegration vs Correlation
| Property | Correlation | Cointegration |
|---|---|---|
| Measures | Short-term co-movement | Long-run equilibrium |
| Stationarity | Requires stationary returns | Works with non-stationary prices |
| Time horizon | Can change rapidly | Stable over months/years |
| Trading use | Momentum/trend signals | Mean-reversion pairs trades |
| Failure mode | Breaks in regime changes | Breaks on structural shifts |
Two assets can be highly correlated but not cointegrated (e.g., two unrelated
uptrends). Conversely, cointegrated assets may have low short-term correlation
during temporary divergences — which is exactly when pairs trades are entered.
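The "two unrelated uptrends" case can be reproduced with synthetic data. This is a minimal sketch under assumed parameters (drift 0.5, unit-variance shocks, seed 42). It builds two independent drifting random walks, then compares the correlation of their levels with the persistence of the regression residual. A lag-1 autocorrelation near 1 in the residual indicates that the spread does not mean-revert.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Two independent random walks that both drift upward (no shared equilibrium).
a = 100.0 + np.cumsum(0.5 + rng.normal(0.0, 1.0, n))
b = 50.0 + np.cumsum(0.5 + rng.normal(0.0, 1.0, n))

level_corr = np.corrcoef(a, b)[0, 1]
change_corr = np.corrcoef(np.diff(a), np.diff(b))[0, 1]

# Residual of regressing b on a: a cointegrated pair would give a mean-reverting residual.
beta, alpha = np.polyfit(a, b, 1)
resid = b - (alpha + beta * a)
resid_autocorr = np.corrcoef(resid[:-1], resid[1:])[0, 1]

print(f"correlation of price levels:   {level_corr:.3f}")
print(f"correlation of price changes:  {change_corr:.3f}")
print(f"lag-1 autocorr of residual:    {resid_autocorr:.3f}")
```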
Why It Matters
- Pairs trading: Long the underperformer, short the outperformer, profit on convergence
- Statistical arbitrage: Systematic mean-reversion on spread z-scores
- Spread trading: Trade the spread directly as a synthetic instrument
- Risk hedging: Cointegrated hedge ratios minimize tracking error over time
Methods
1. Engle-Granger Two-Step
The most common approach for two series.
Step 1: Regress Y on X using OLS:
Y_t = α + β * X_t + ε_t
Step 2: Test the residuals ε_t for stationarity using the ADF test.
- If the residuals are stationary (ADF statistic below the Engle-Granger critical value, e.g. -3.34 at 5%) → Y and X are cointegrated
- β is the hedge ratio for the pairs trade
- α is the long-run mean of the spread Y_t - β * X_t
Important: Engle-Granger critical values differ from standard ADF critical
values because the residuals come from an estimated regression. For n=2 series: 1% = -3.90, 5% = -3.34, 10% = -3.04.
Asymmetry warning: Regressing Y on X can give a different result than regressing X on Y. Always
test both directions and use the stronger result.
```python
from scipy import stats
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Step 1: OLS regression of Y on X (x_prices, y_prices: equal-length 1-D arrays)
slope, intercept, _, _, _ = stats.linregress(x_prices, y_prices)
hedge_ratio = slope

# Step 2: ADF test on the residuals
residuals = y_prices - hedge_ratio * x_prices - intercept
adf_stat, p_value, _, _, crit_values, _ = adfuller(residuals, maxlag=None, autolag="AIC")

# p_value and crit_values from adfuller assume an observed series, not
# regression residuals, so compare the statistic with the Engle-Granger
# 5% critical value for two series instead.
EG_CRIT_5PCT = -3.34
cointegrated = adf_stat < EG_CRIT_5PCT
```
2. Johansen Test
Tests multiple series simultaneously and returns the number of cointegrating
relationships. More powerful than Engle-Granger for >2 series.
- Based on a VAR model: ΔY_t = Π·Y_{t-1} + Σ Γ_i·ΔY_{t-i} + ε_t
- Tests the rank of the Π matrix
- Uses trace test and maximum eigenvalue test
- Returns: number of cointegrating vectors and the vectors themselves
```python
from statsmodels.tsa.vector_ar.vecm import coint_johansen

# data: T×N array of price series
result = coint_johansen(data, det_order=0, k_ar_diff=1)

# Trace statistic vs critical values (90%, 95%, 99%)
trace_stats = result.lr1       # Trace statistics
trace_crit = result.cvt        # Critical values
max_eigen_stats = result.lr2   # Max eigenvalue statistics
max_eigen_crit = result.cvm    # Critical values

# Cointegrating vectors
coint_vectors = result.evec
```
3. Phillips-Ouliaris
Similar to Engle-Granger but uses Phillips-Perron style test statistics
instead of ADF. More robust to heteroskedasticity and serial correlation in
the residuals. A residual-based test with the same calling pattern is available via
`statsmodels.tsa.stattools.coint`. Note that this function implements the augmented Engle-Granger test with MacKinnon critical values rather than the Phillips-Ouliaris statistics. The `arch` package provides a dedicated `phillips_ouliaris` function in `arch.unitroot.cointegration`.
```python
from statsmodels.tsa.stattools import coint

# Returns: test statistic, p-value, critical values (1%, 5%, 10%)
t_stat, p_value, crit_values = coint(y_prices, x_prices)
cointegrated = p_value < 0.05
```
Practical Workflow
Step 1: Screen Pairs by Correlation
Pre-filter using Pearson correlation > 0.7 to reduce the number of
cointegration tests (which are more expensive).
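A minimal sketch of the pre-filter follows. It assumes the correlation is computed on log returns (the text does not say whether prices or returns are meant). The function name `screen_pairs` and the symbols `AAA`, `BBB`, `CCC` are hypothetical.

```python
from itertools import combinations
import numpy as np

def screen_pairs(prices, threshold=0.7):
    """Return (name_a, name_b, corr) for pairs whose log-return correlation exceeds threshold.

    prices: dict mapping a symbol to a 1-D array of positive prices (equal lengths).
    """
    returns = {name: np.diff(np.log(np.asarray(p, dtype=float))) for name, p in prices.items()}
    candidates = []
    for a, b in combinations(sorted(returns), 2):
        corr = float(np.corrcoef(returns[a], returns[b])[0, 1])
        if corr > threshold:
            candidates.append((a, b, corr))
    # Strongest correlations first, so the expensive tests start with the best candidates.
    return sorted(candidates, key=lambda item: item[2], reverse=True)

# Synthetic demo: AAA and BBB share return shocks, CCC is independent.
rng = np.random.default_rng(7)
base = rng.normal(0.0, 0.02, 500)
prices = {
    "AAA": 100 * np.exp(np.cumsum(base)),
    "BBB": 50 * np.exp(np.cumsum(base + rng.normal(0.0, 0.005, 500))),
    "CCC": 20 * np.exp(np.cumsum(rng.normal(0.0, 0.02, 500))),
}
screened = screen_pairs(prices)
print(screened)
```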
Step 2: Test Cointegration
Run Engle-Granger in both directions. Use p < 0.05 threshold.
Step 3: Estimate Hedge Ratio
Use OLS for simplicity. For production, consider Total Least Squares or
Dynamic OLS (see `references/methodology.md`).
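As a rough illustration of the Total Least Squares option, the sketch below fits an orthogonal regression with an SVD. It is not taken from `references/methodology.md`, and the function name `tls_hedge_ratio` is hypothetical. Unlike OLS, the fit is symmetric: swapping the two series gives the reciprocal slope.

```python
import numpy as np

def tls_hedge_ratio(x, y):
    """Total least squares (orthogonal regression) fit of y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    centered = np.column_stack([x - x.mean(), y - y.mean()])
    # The right singular vector with the smallest singular value is normal to the fitted line.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal_x, normal_y = vt[-1]
    slope = -normal_x / normal_y
    intercept = y.mean() - slope * x.mean()
    return float(slope), float(intercept)

# Synthetic demo with a true hedge ratio of 2.
rng = np.random.default_rng(11)
x_demo = 100.0 + np.cumsum(rng.normal(size=500))
y_demo = 1.0 + 2.0 * x_demo + rng.normal(scale=0.5, size=500)
tls_slope, tls_intercept = tls_hedge_ratio(x_demo, y_demo)
reverse_slope, _ = tls_hedge_ratio(y_demo, x_demo)
print(f"TLS slope: {tls_slope:.4f}, reverse slope: {reverse_slope:.4f}")
```

Because the result depends on the relative scale of the two series, it may be worth normalizing prices before fitting.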
Step 4: Compute Spread
```python
spread = y_prices - hedge_ratio * x_prices - intercept
z_score = (spread - spread.mean()) / spread.std()
```
Step 5: Test Spread for Mean Reversion
- ADF test: p < 0.05 confirms stationarity
- Hurst exponent: H < 0.5 indicates mean reversion (H ≈ 0.5 = random walk)
- Half-life: λ from AR(1) on spread; half-life = -ln(2)/ln(λ)
- Viable pairs: half-life between 5 and 60 days
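The half-life and Hurst checks above can be sketched with NumPy alone. The function names and the lagged-difference Hurst estimator are assumptions for this example; other estimators (such as rescaled range) give somewhat different values. The demo compares a simulated AR(1) spread with λ = 0.9, whose theoretical half-life is about 6.6 periods, against a random walk.

```python
import numpy as np

def half_life(spread):
    """Half-life from an AR(1) fit: spread[t] = c + lam * spread[t-1] + e."""
    spread = np.asarray(spread, dtype=float)
    lam, _ = np.polyfit(spread[:-1], spread[1:], 1)
    if lam >= 1.0:
        return float("inf")  # no mean reversion detected
    if lam <= 0.0:
        return float("nan")  # formula is undefined for non-positive lam
    return float(-np.log(2.0) / np.log(lam))

def hurst_exponent(series, max_lag=50):
    """Estimate H from how the std of lagged differences scales with the lag."""
    series = np.asarray(series, dtype=float)
    lags = np.arange(2, max_lag)
    stds = [np.std(series[lag:] - series[:-lag]) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(stds), 1)
    return float(slope)

rng = np.random.default_rng(5)
n = 5000
shocks = rng.normal(size=n)
ar1_spread = np.zeros(n)
for t in range(1, n):
    ar1_spread[t] = 0.9 * ar1_spread[t - 1] + shocks[t]
random_walk = np.cumsum(rng.normal(size=n))

hl = half_life(ar1_spread)
h_spread = hurst_exponent(ar1_spread)
h_walk = hurst_exponent(random_walk)
print(f"half-life: {hl:.1f}, Hurst (spread): {h_spread:.2f}, Hurst (random walk): {h_walk:.2f}")
```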
Step 6: Trade the Spread
If the spread is mean-reverting, it is a viable pairs trade candidate.
See `references/pairs_trading.md` for entry/exit rules and risk management.
Rolling Cointegration
Cointegration relationships can break down over time due to structural changes,
regime shifts, or evolving market dynamics.
Rolling Window Approach
Test cointegration on rolling 60–90 day windows:
```python
from scipy import stats
from statsmodels.tsa.stattools import coint

window = 60
rolling_pvalues = []
rolling_hedges = []

# y_prices and x_prices are equal-length arrays; "+ 1" lets the final
# window end at the latest observation
for i in range(window, len(y_prices) + 1):
    y_win = y_prices[i - window:i]
    x_win = x_prices[i - window:i]
    _, p_val, _ = coint(y_win, x_win)
    slope, intercept, _, _, _ = stats.linregress(x_win, y_win)
    rolling_pvalues.append(p_val)
    rolling_hedges.append(slope)
```
Monitoring Signals
| Signal | Healthy | Warning | Stop Trading |
|---|---|---|---|
| Rolling p-value | < 0.05 | 0.05–0.10 | > 0.10 |
| Hedge ratio drift | < 10% change | 10–25% change | > 25% change |
| Spread half-life | 5–60 days | 60–120 days | > 120 days or < 5 |
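The thresholds in the table can be encoded as a small classifier. The function name, the handling of exact boundary values, and the "worst signal wins" aggregation are assumptions of this sketch; the table does not specify them.

```python
SEVERITY = ("healthy", "warning", "stop")

def classify_pair_health(p_value, hedge_drift_pct, half_life_days):
    """Classify each monitoring signal and return (overall, per_signal)."""
    if p_value < 0.05:
        p_status = "healthy"
    elif p_value <= 0.10:
        p_status = "warning"
    else:
        p_status = "stop"

    drift = abs(hedge_drift_pct)
    if drift < 10:
        drift_status = "healthy"
    elif drift <= 25:
        drift_status = "warning"
    else:
        drift_status = "stop"

    if 5 <= half_life_days <= 60:
        hl_status = "healthy"
    elif 60 < half_life_days <= 120:
        hl_status = "warning"
    else:  # shorter than 5 days or longer than 120 days
        hl_status = "stop"

    per_signal = {"p_value": p_status, "hedge_drift": drift_status, "half_life": hl_status}
    # Assumption: the overall status is the most severe individual status.
    overall = max(per_signal.values(), key=SEVERITY.index)
    return overall, per_signal

print(classify_pair_health(0.07, 5.0, 20.0))
```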
Crypto Pairs Candidates
Layer-1 Correlation
- SOL vs ETH — L1 sector beta, often cointegrated during trending markets
- SOL vs AVAX — alternative L1 correlation
Stablecoins
- USDC vs USDT — should be perfectly cointegrated (peg arbitrage)
- Useful as a sanity check for your cointegration pipeline
Liquid Staking Derivatives
- mSOL vs jitoSOL — both track SOL staking yield
- stSOL vs mSOL — Lido vs Marinade staking
Same-Sector Tokens
- DEX tokens: RAY vs ORCA
- Lending tokens: cross-protocol comparison
- Meme tokens: rarely cointegrated, high risk
Common Pitfalls
- Spurious cointegration: Two trending series (both up in a bull market) may appear cointegrated. Always test on sufficient data (>200 observations) and check out-of-sample stability.
- Structural breaks: A fundamental change (protocol upgrade, tokenomics change) can permanently break cointegration. Monitor rolling p-values.
- Look-ahead bias: Estimating the hedge ratio on the full sample and then backtesting on the same sample inflates results. Always use walk-forward estimation.
- Too-short sample: Cointegration tests need >100 observations minimum, ideally >200, to have reasonable power.
- Ignoring transaction costs: Pairs trades involve 4 transactions per round trip. At 0.3% per leg, that is 1.2% in costs that the spread must overcome.
- Asymmetric cointegration: The relationship may only hold in one direction or one regime. Consider threshold cointegration models for production use.
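To make the look-ahead point concrete, the following sketch computes out-of-sample z-scores with walk-forward estimation. The hedge ratio, spread mean, and spread standard deviation are fitted on a trailing training window and then applied only to the block that follows. The function name and the window lengths (120 training, 20 test observations) are assumptions for illustration.

```python
import numpy as np

def walk_forward_zscores(y, x, train=120, test=20):
    """Out-of-sample spread z-scores: fit on each training window, apply to the next block."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    z = np.full(len(y), np.nan)  # the first `train` points have no out-of-sample estimate
    for start in range(train, len(y), test):
        y_tr, x_tr = y[start - train:start], x[start - train:start]
        beta, alpha = np.polyfit(x_tr, y_tr, 1)  # hedge ratio from past data only
        resid_tr = y_tr - (alpha + beta * x_tr)
        mu, sigma = resid_tr.mean(), resid_tr.std(ddof=1)
        stop = min(start + test, len(y))
        spread = y[start:stop] - (alpha + beta * x[start:stop])
        z[start:stop] = (spread - mu) / sigma
    return z

# Synthetic demo pair.
rng = np.random.default_rng(9)
x_demo = 100.0 + np.cumsum(rng.normal(size=400))
y_demo = 5.0 + 1.2 * x_demo + rng.normal(scale=0.5, size=400)
z_demo = walk_forward_zscores(y_demo, x_demo)
print("first out-of-sample z-scores:", np.round(z_demo[120:125], 2))
```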
Integration with Other Skills
- `correlation-analysis`: Pre-screening pairs by correlation before cointegration testing
- `mean-reversion`: Trading the cointegrated spread using mean-reversion entry/exit rules
- `vectorbt`: Backtesting pairs strategies with walk-forward validation
- `regime-detection`: Identifying when cointegration regimes shift
- `volatility-modeling`: Spread volatility forecasting for dynamic position sizing
Files
References
- `references/methodology.md`: Engle-Granger details, Johansen derivation, hedge ratio estimation methods, spread construction
- `references/pairs_trading.md`: Entry/exit rules, risk management, performance metrics, crypto-specific considerations
Scripts
- `scripts/test_cointegration.py`: Full cointegration test pipeline with ADF, Hurst, half-life, rolling stability, and demo mode
- `scripts/pairs_backtest.py`: Walk-forward pairs trading backtest with synthetic data and performance reporting