algo-forecast-arima

ARIMA Time Series Model

Overview

ARIMA(p,d,q) combines autoregression (AR, order p), differencing (I, order d), and moving average (MA, order q) for time series forecasting. The seasonal variant is SARIMA(p,d,q)(P,D,Q,s), where s is the season length. The AR and MA terms assume a stationary series, which the differencing step is meant to deliver. Best for univariate series with a clear trend and/or seasonal pattern.
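
For reference, the model can be written compactly in backshift notation, where B shifts a series back one step (B y_t = y_{t-1}). The symbols below (phi_i, theta_j, c, epsilon_t) are standard textbook notation and are an assumption here, not something this document defines; epsilon_t denotes white noise.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^i\right) (1 - B)^d \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^j\right) \varepsilon_t
```

The factor (1 - B)^d is the differencing step: with d = 1 it replaces y_t by y_t - y_{t-1} before the AR and MA parts are applied.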

When to Use

Trigger conditions:
  • Forecasting univariate time series (sales, demand, traffic)
  • Data has clear trend and/or seasonal patterns
  • Need interpretable model with statistical properties
When NOT to use:
  • For multivariate forecasting with many external features (use ML models)
  • For very long-range forecasts (ARIMA confidence intervals widen rapidly)
  • For irregular/event-driven data (use causal models)

Algorithm

IRON LAW: ARIMA Requires STATIONARY Data
Non-stationary data (trend, changing variance) violates ARIMA assumptions.
Test stationarity with the ADF test (p < 0.05 rejects a unit root, so treat the series as stationary).
If non-stationary: difference the series (d=1 usually suffices).
If still non-stationary after d=2, ARIMA may not be appropriate.
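
The differencing step itself is simple enough to sketch in plain Python. The helper name `difference` is hypothetical and chosen for illustration; the ADF test is not implemented here and would normally come from a statistics library (for example `adfuller` in statsmodels).

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass maps y[t] to y[t] - y[t-1]."""
    out = list(series)
    for _ in range(d):
        # Each pass shortens the series by one observation.
        out = [b - a for a, b in zip(out, out[1:])]
    return out


trend = [10, 12, 14, 16, 18, 20]
print(difference(trend, d=1))  # [2, 2, 2, 2, 2]
print(difference(trend, d=2))  # [0, 0, 0, 0]
```

Each round of differencing costs one observation, which is one more reason to stop at the smallest d that passes the stationarity test.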

Phase 1: Input Validation

Check: regular time intervals, no missing values (impute if needed), minimum 50 observations (ideally 2+ full seasonal cycles). Test stationarity with ADF test. Gate: Data is regular, sufficient length, stationarity assessed.
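
The gate above could be checked along these lines. This is a minimal sketch that assumes monthly data stamped with `datetime.date` values; the function name `validate_series` and its messages are hypothetical, and the stationarity test is left to a separate step.

```python
from datetime import date


def validate_series(dates, values, min_obs=50):
    """Return a list of problems found; an empty list means the checks pass."""
    problems = []
    if len(dates) != len(values):
        problems.append("dates and values differ in length")
    if len(values) < min_obs:
        problems.append(f"only {len(values)} observations, need {min_obs}")
    # v != v is true only for NaN, so this catches both None and NaN.
    if any(v is None or v != v for v in values):
        problems.append("missing values present")
    # Monthly regularity: consecutive dates must be exactly one month apart.
    months = [d.year * 12 + d.month for d in dates]
    if any(b - a != 1 for a, b in zip(months, months[1:])):
        problems.append("irregular monthly spacing")
    return problems


dates = [date(2020 + i // 12, i % 12 + 1, 1) for i in range(60)]
values = [100.0 + i for i in range(60)]
print(validate_series(dates, values))  # []
```

Returning a list of problems rather than raising on the first one lets the caller report every failed check at once.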

Phase 2: Core Algorithm

  1. Stationarity: ADF test. If p > 0.05, difference (d=1). Retest.
  2. Parameter selection: Examine ACF/PACF plots, or use auto_arima (AIC-based search over candidate orders).
    • p (AR terms): PACF cutoff lag
    • q (MA terms): ACF cutoff lag
    • d: number of differences needed
  3. Fit model: Maximum likelihood estimation
  4. Forecast: Generate predictions with confidence intervals
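
To make the ACF part of step 2 concrete, here is a small pure-Python sketch of the sample autocorrelation and a rough significance band of ±1.96/√n. The function names are hypothetical, the band is the usual large-sample approximation, and the PACF is not covered.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(e * e for e in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def significant_lags(series, max_lag, z=1.96):
    """Lags whose autocorrelation falls outside the approximate 95% band."""
    bound = z / math.sqrt(len(series))
    r = acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]


alternating = [1, -1] * 10
print([round(r, 2) for r in acf(alternating, 2)])  # [1.0, -0.95, 0.9]
print(significant_lags(alternating, 3))            # [1, 2, 3]
```

On a differenced series, the last significant lag before the ACF cuts off is a candidate for q; in practice a plotting routine from a statistics library would normally be used instead of hand-rolled code.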

Phase 3: Verification

Check residuals: they should be white noise (no autocorrelation). Run the Ljung-Box test (p > 0.05 = no significant autocorrelation). Residuals should also be approximately normally distributed. Gate: Residuals pass Ljung-Box test, no remaining patterns.
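
The Ljung-Box statistic is Q = n(n+2) Σ r_k² / (n−k), summed over lags k = 1..h, and is compared against a chi-square distribution. The sketch below is a pure-Python illustration with hypothetical function names. It assumes h degrees of freedom and an even h so that a closed-form chi-square tail can be used; for residuals of a fitted ARIMA model the degrees of freedom are commonly reduced by p + q, which this sketch does not do.

```python
import math


def ljung_box_q(residuals, lags):
    """Ljung-Box Q statistic over lags 1..lags."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(x * x for x in dev)
    total = 0.0
    for k in range(1, lags + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


def chi2_sf_even(x, dof):
    """Chi-square upper-tail probability; closed form valid only for even dof."""
    if dof <= 0 or dof % 2:
        raise ValueError("closed form needs a positive even dof")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(dof // 2)
    )


residuals = [1, -1] * 10          # strongly autocorrelated, should fail the gate
q = ljung_box_q(residuals, lags=2)
p_value = chi2_sf_even(q, dof=2)
print(round(q, 1))                # 40.7
print(p_value > 0.05)             # False
```

A library routine (for example `acorr_ljungbox` in statsmodels) handles arbitrary degrees of freedom and is the safer choice outside of a sketch.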

Phase 4: Output

Return forecasts with confidence intervals.

Output Format

```json
{
  "forecasts": [{"period": "2025-04", "forecast": 1250, "lower_95": 1100, "upper_95": 1400}],
  "model": {"order": [1,1,1], "seasonal_order": [1,1,1,12], "aic": 520.3},
  "metadata": {"training_periods": 60, "forecast_horizon": 12}
}
```
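
One way to assemble a payload in this shape is sketched below. The function `build_output` and its arguments are hypothetical, and setting `forecast_horizon` to the number of forecast rows is an assumption of this sketch rather than something the format specifies.

```python
import json


def build_output(periods, forecasts, lowers, uppers,
                 order, seasonal_order, aic, training_periods):
    """Assemble the output dictionary, checking that each interval brackets its forecast."""
    rows = []
    for p, f, lo, hi in zip(periods, forecasts, lowers, uppers):
        if not lo <= f <= hi:
            raise ValueError(f"interval for {p} does not contain the forecast")
        rows.append({"period": p, "forecast": f, "lower_95": lo, "upper_95": hi})
    return {
        "forecasts": rows,
        "model": {
            "order": list(order),
            "seasonal_order": list(seasonal_order),
            "aic": aic,
        },
        # Assumption: the horizon equals the number of forecast rows returned.
        "metadata": {
            "training_periods": training_periods,
            "forecast_horizon": len(rows),
        },
    }


out = build_output(["2025-04"], [1250], [1100], [1400],
                   (1, 1, 1), (1, 1, 1, 12), 520.3, 60)
print(json.dumps(out, indent=2))
```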

Examples

Sample I/O

Input: 12 monthly observations with upward trend: [10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]
Step 1: First difference = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] (constant → stationary, d=1 sufficient)
Step 2: ARIMA(0,1,0), a random walk with drift μ=2, is the simplest model that fits.
Expected forecast (ARIMA(0,1,0) with drift=2):
  • Period 13: 32 + 2 = 34
  • Period 14: 32 + 4 = 36
  • Period 15: 32 + 6 = 38
Verify: differenced series is constant (2) → no AR/MA terms needed. Residuals are exactly 0 → perfect fit (toy example). On real data, residuals should pass Ljung-Box (p > 0.05).
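
The arithmetic in this toy example can be reproduced with a few lines of Python. The helper `drift_forecast` is hypothetical; it estimates the drift as the mean first difference and extrapolates from the last observation, which is how an ARIMA(0,1,0) model with drift produces point forecasts.

```python
def drift_forecast(series, horizon):
    """Point forecasts for a random walk with drift: last value + h * mean difference."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs)
    return [series[-1] + drift * h for h in range(1, horizon + 1)]


history = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]
print(drift_forecast(history, 3))  # [34.0, 36.0, 38.0]
```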

Edge Cases

| Input | Expected | Why |
| --- | --- | --- |
| No trend, no seasonality | ARIMA(p,0,q) | No differencing needed |
| Strong trend only | ARIMA(p,1,q) | Single difference removes linear trend |
| Multiple seasonalities | ARIMA may struggle | Consider Prophet or TBATS instead |

Gotchas

  • Over-differencing: d=2 when d=1 suffices introduces unnecessary noise. Check if first difference is stationary before differencing again.
  • Auto-ARIMA isn't magic: AIC-based selection can pick overfit models. Always check residual diagnostics regardless of auto selection.
  • Confidence intervals widen fast: Multi-step forecasts accumulate uncertainty. Don't trust point forecasts beyond 2-3 seasonal cycles.
  • Calendar effects: Business days, holidays, and leap years affect monthly/weekly data. ARIMA doesn't handle these natively; add regressors or use Prophet.
  • Structural breaks: ARIMA assumes the data-generating process is stable. COVID, market shocks, or policy changes break this assumption.
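
To illustrate how quickly intervals widen, consider the simplest case, ARIMA(0,1,0) with drift, where the h-step forecast variance is h·σ² and the 95% half-width therefore grows like 1.96·σ·√h. The sketch below uses a hypothetical helper and made-up inputs, and it ignores parameter-estimation uncertainty, so real intervals would be somewhat wider.

```python
import math


def random_walk_interval(last_value, drift, sigma, horizon, z=1.96):
    """Return (h, point, lower, upper) rows for a random walk with drift."""
    rows = []
    for h in range(1, horizon + 1):
        point = last_value + drift * h
        half = z * sigma * math.sqrt(h)  # half-width grows with sqrt(h)
        rows.append((h, point, point - half, point + half))
    return rows


for h, point, lower, upper in random_walk_interval(100, 2, 10, 4):
    print(h, point, round(lower, 1), round(upper, 1))
```

With these inputs the interval at h=4 is exactly twice as wide as at h=1, and models with larger d widen faster still.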

References

  • For the ACF/PACF interpretation guide, see references/acf-pacf.md
  • For SARIMA seasonal parameter selection, see references/seasonal-arima.md