algo-forecast-arima
ARIMA Time Series Model
Overview
ARIMA(p,d,q) combines autoregression (AR, order p), differencing (I, order d), and moving average (MA, order q) for time series forecasting. The seasonal variant is SARIMA(p,d,q)(P,D,Q,s), where s is the season length. The AR and MA parts assume a stationary series; differencing is the step that makes a trending series stationary. Best for univariate series with clear trend and/or seasonality patterns.
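For reference, the model can be written compactly with the backshift operator $B$, where $B y_t = y_{t-1}$. The symbols below ($\phi_i$, $\theta_j$, $c$, $\varepsilon_t$ and their seasonal counterparts) follow common textbook notation and are an assumption here, since the source does not define them.

```latex
% ARIMA(p,d,q): AR polynomial applied to the d-times differenced series
\Bigl(1 - \sum_{i=1}^{p} \phi_i B^i\Bigr)(1 - B)^d \, y_t
  = c + \Bigl(1 + \sum_{j=1}^{q} \theta_j B^j\Bigr)\varepsilon_t

% SARIMA(p,d,q)(P,D,Q,s): the same structure repeated at the seasonal lag s
\Phi_P(B^s)\,\phi_p(B)\,(1 - B)^d (1 - B^s)^D \, y_t
  = c + \Theta_Q(B^s)\,\theta_q(B)\,\varepsilon_t
```

Here $\varepsilon_t$ is white noise, $(1 - B)^d$ is ordinary differencing, and $(1 - B^s)^D$ is seasonal differencing.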
When to Use
Trigger conditions:
- Forecasting univariate time series (sales, demand, traffic)
- Data has clear trend and/or seasonal patterns
- Need interpretable model with statistical properties
When NOT to use:
- For multivariate forecasting with many external features (use ML models)
- For very long-range forecasts (ARIMA confidence intervals widen rapidly)
- For irregular/event-driven data (use causal models)
Algorithm
IRON LAW: ARIMA Requires STATIONARY Data
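Non-stationary data (trend, changing variance) violates the assumptions of the AR and MA components.
Test stationarity with the ADF test (p < 0.05 rejects the unit-root null, so treat the series as stationary).
If non-stationary: difference the series (d=1 usually suffices). Differencing removes trend, not changing variance; stabilize variance with a log or Box-Cox transform first.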
If still non-stationary after d=2, ARIMA may not be appropriate.
Phase 1: Input Validation
Check: regular time intervals, no missing values (impute if needed), minimum 50 observations (ideally 2+ full seasonal cycles). Test stationarity with ADF test.
Gate: Data is regular, sufficient length, stationarity assessed.
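The checks in this phase can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes monthly data labelled as "YYYY-MM" strings (the label style used in the output format), and the function names `month_index` and `validate_monthly_series` are hypothetical. The ADF test itself is left to a statistics library.

```python
import math


def month_index(period):
    """Convert a 'YYYY-MM' label into a running month count."""
    year, month = period.split("-")
    return int(year) * 12 + int(month)


def validate_monthly_series(periods, values, min_obs=50, season_length=12):
    """Return a list of problems; an empty list means the gate passes."""
    problems = []
    if len(periods) != len(values):
        problems.append("periods and values differ in length")
    # Regular intervals: consecutive labels must be exactly one month apart.
    idx = [month_index(p) for p in periods]
    if any(b - a != 1 for a, b in zip(idx, idx[1:])):
        problems.append("irregular or duplicated time intervals")
    # Missing values: None or NaN must be imputed before fitting.
    if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
        problems.append("missing values present")
    # Length: at least min_obs points and two full seasonal cycles.
    required = max(min_obs, 2 * season_length)
    if len(values) < required:
        problems.append(f"only {len(values)} observations, need {required}")
    return problems


periods = [f"{2020 + i // 12}-{i % 12 + 1:02d}" for i in range(60)]
values = [100.0 + i for i in range(60)]
print(validate_monthly_series(periods, values))  # []
```

Returning a list of problems rather than raising on the first failure lets the caller report every gate violation at once.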
Phase 2: Core Algorithm
- Stationarity: run the ADF test. If p > 0.05, difference once (d=1) and retest.
- Parameter selection: examine the ACF/PACF plots of the differenced series, or use auto_arima (AIC-based search).
  - p (AR terms): lag after which the PACF cuts off
  - q (MA terms): lag after which the ACF cuts off
  - d: number of differences needed
- Fit model: maximum likelihood estimation
- Forecast: generate predictions with confidence intervals
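The quantities behind the ACF/PACF step can be computed by hand, which helps when reading the plots. The sketch below is illustrative only: the helper names are hypothetical, the PACF uses the standard Durbin-Levinson recursion, and the ±1.96/√n band is the usual large-sample approximation rather than an exact test. Model fitting and the ADF test are not covered here.

```python
def difference(series, d=1):
    """Apply first differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev)
    if c0 == 0:
        raise ValueError("constant series has no defined autocorrelation")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(rho):
    """Partial autocorrelations for lags 1..len(rho)-1 (Durbin-Levinson)."""
    pacf, prev = [], []  # prev[j-1] holds phi_{k-1,j}
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
        phi_kk = num / den
        prev = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


def significant_lags(rho, n):
    """Lags above the approximate 95% band of +/- 1.96 / sqrt(n)."""
    band = 1.96 / n ** 0.5
    return [k for k, r in enumerate(rho) if k > 0 and abs(r) > band]
```

For an AR(1) process with coefficient 0.5, the theoretical ACF decays as 0.5^k while `pacf_from_acf([1, 0.5, 0.25, 0.125])` is non-zero only at lag 1, which is the "PACF cuts off at p" pattern described above.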
Phase 3: Verification
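Check residuals: they should be white noise (no autocorrelation). Use the Ljung-Box test (p > 0.05 = no evidence of autocorrelation). Residuals should also be approximately normally distributed, since the forecast confidence intervals rely on that assumption.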
Gate: Residuals pass Ljung-Box test, no remaining patterns.
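To make the gate concrete, here is a stdlib-only sketch of the Ljung-Box statistic, Q = n(n+2) Σ ρ_k² / (n−k), with its chi-square p-value. The function names are hypothetical, and the chi-square tail is computed with a simple series expansion that is adequate for illustration but not tuned for extreme values; a statistics library is the better choice in practice.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability via the incomplete-gamma series."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 1000):
        term *= z / (a + n)
        total += term
        if term < 1e-15 * total:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))


def ljung_box(residuals, lags, n_fitted_params=0):
    """Return (Q, p_value) over the first `lags` residual autocorrelations."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(v * v for v in dev)
    if c0 == 0:
        raise ValueError("residuals are constant; the test is undefined")
    q = 0.0
    for k in range(1, lags + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        q += rho_k ** 2 / (n - k)
    q *= n * (n + 2)
    # Degrees of freedom shrink by the number of fitted AR and MA coefficients.
    df = lags - n_fitted_params
    if df <= 0:
        raise ValueError("lags must exceed the number of fitted parameters")
    return q, chi2_sf(q, df)


# Strongly alternating residuals are clearly autocorrelated and fail the gate.
q_stat, p_value = ljung_box([1, -1] * 10, lags=1)
print(p_value < 0.05)  # True
```

Pass `n_fitted_params=p + q` when testing residuals from a fitted ARIMA(p,d,q) so the degrees of freedom are adjusted.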
Phase 4: Output
Return forecasts with confidence intervals.
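As a rough illustration of this phase, the sketch below forecasts the simplest case, ARIMA(0,1,0) with drift, and packages the result in the shape of the Output Format section. Several things are assumptions: the names `add_months` and `build_output`, the extra `drift` key, and the omission of `seasonal_order` and `aic`, which this toy model does not compute. The interval uses the h-step variance hσ² of a random walk and ignores uncertainty in the estimated drift.

```python
import json
import math


def add_months(period, k):
    """Shift a 'YYYY-MM' label forward by k months."""
    year, month = map(int, period.split("-"))
    total = year * 12 + (month - 1) + k
    return f"{total // 12:04d}-{total % 12 + 1:02d}"


def build_output(series, last_period, horizon, z=1.96):
    """Forecast ARIMA(0,1,0) with drift and package the result."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs)
    # Innovation standard deviation estimated from the differenced series.
    sigma = math.sqrt(sum((d - drift) ** 2 for d in diffs) / (len(diffs) - 1))
    forecasts = []
    for h in range(1, horizon + 1):
        point = series[-1] + h * drift
        half_width = z * sigma * math.sqrt(h)  # interval widens with sqrt(h)
        forecasts.append({
            "period": add_months(last_period, h),
            "forecast": point,
            "lower_95": point - half_width,
            "upper_95": point + half_width,
        })
    return {
        "forecasts": forecasts,
        "model": {"order": [0, 1, 0], "drift": drift},
        "metadata": {"training_periods": len(series), "forecast_horizon": horizon},
    }


result = build_output([10, 13, 14, 17, 18], "2025-03", horizon=4)
print(json.dumps(result["metadata"]))
```

The √h growth of the half-width is why long-range intervals become too wide to be useful.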
Output Format
```json
{
  "forecasts": [{"period": "2025-04", "forecast": 1250, "lower_95": 1100, "upper_95": 1400}],
  "model": {"order": [1,1,1], "seasonal_order": [1,1,1,12], "aic": 520.3},
  "metadata": {"training_periods": 60, "forecast_horizon": 12}
}
```
Examples
Sample I/O
Input: 12 monthly observations with upward trend: [10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]
Step 1: First difference = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] (constant → stationary, d=1 sufficient)
Step 2: ARIMA(0,1,0) random walk with drift μ=2 is the simplest fitting model.
Expected forecast (ARIMA(0,1,0) with drift=2):
- Period 13: 32 + 2 = 34
- Period 14: 32 + 4 = 36
- Period 15: 32 + 6 = 38
Verify: differenced series is constant (2) → no AR/MA terms needed. Residuals are exactly 0 → perfect fit (toy example). On real data, residuals should pass Ljung-Box (p > 0.05).
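The arithmetic of this toy example can be reproduced in a few lines. This is only a sketch of the random-walk-with-drift case, with a hypothetical function name; it estimates the drift as the mean of the first differences instead of fitting by maximum likelihood.

```python
def random_walk_with_drift(series, horizon):
    """Return (drift, residuals, forecasts) for ARIMA(0,1,0) with drift."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs)
    # Residuals are the differences left over after removing the drift.
    residuals = [d - drift for d in diffs]
    forecasts = [series[-1] + h * drift for h in range(1, horizon + 1)]
    return drift, residuals, forecasts


observed = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]
drift, residuals, forecasts = random_walk_with_drift(observed, horizon=3)
print(drift)                            # 2.0
print(forecasts)                        # [34.0, 36.0, 38.0]
print(all(r == 0 for r in residuals))   # True
```

Because every residual is exactly zero, a Ljung-Box test is undefined here; that is a property of the toy data, not of the method.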
Edge Cases
| Input | Expected | Why |
|---|---|---|
| No trend, no seasonality | ARIMA(p,0,q) | No differencing needed |
| Strong trend only | ARIMA(p,1,q) | Single difference removes linear trend |
| Multiple seasonalities | ARIMA may struggle | Consider Prophet or TBATS instead |
Gotchas
- Over-differencing: d=2 when d=1 suffices introduces unnecessary noise. Check if first difference is stationary before differencing again.
- Auto-ARIMA isn't magic: AIC-based selection can pick overfit models. Always check residual diagnostics regardless of auto selection.
- Confidence intervals widen fast: Multi-step forecasts accumulate uncertainty. Don't trust point forecasts beyond 2-3 seasonal cycles.
- Calendar effects: Business days, holidays, and leap years affect monthly/weekly data. ARIMA doesn't handle these natively — add regressors or use Prophet.
- Structural breaks: ARIMA assumes the data-generating process is stable. COVID, market shocks, or policy changes break this assumption.
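The over-differencing point can be seen numerically. A common rule of thumb, used here as an assumption and not as a replacement for the ADF test, is that the right order of differencing is often the one with the lowest variance: too little differencing leaves the trend in, too much adds noise. The series below is synthetic (a linear trend plus an alternating ±1 term) and the function names are hypothetical.

```python
from statistics import pvariance


def difference(series, d):
    """Apply first differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def variance_by_order(series, max_d=2):
    """Population variance of the series after d = 0..max_d differences."""
    return {d: pvariance(difference(series, d)) for d in range(max_d + 1)}


# Synthetic data: slope-2 trend plus a deterministic alternating disturbance.
series = [2 * t + (-1) ** t for t in range(20)]
variances = variance_by_order(series)
best_d = min(variances, key=variances.get)
print(best_d)  # 1
```

Here the variance falls sharply from d=0 to d=1 and then rises to 16 at d=2, which is the extra noise that a second, unnecessary difference introduces.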
References
- For the ACF/PACF interpretation guide, see references/acf-pacf.md
- For SARIMA seasonal parameter selection, see references/seasonal-arima.md