Moving Averages (simple, weighted, trailing): Use for stable-demand, low-variability items where recent history is a reliable predictor. A 4-week simple moving average works for commodity staples. Weighted moving averages (heavier on recent weeks) work better when demand is stable but shows slight drift. Never use moving averages on seasonal items: they flatten seasonal peaks and lag trend changes by roughly half the window length.
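As a rough illustration of the lag, the sketch below computes a 4-week simple and a weighted moving average in plain Python. The function names, the weights (0.1 to 0.4, oldest to newest), and the demand figures are made up for the example.

```python
def simple_moving_average(history, window=4):
    """Forecast the next period as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    return sum(history[-window:]) / window


def weighted_moving_average(history, weights=(0.1, 0.2, 0.3, 0.4)):
    """Forecast the next period; weights run oldest to newest and must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if len(history) < len(weights):
        raise ValueError("need at least len(weights) observations")
    recent = history[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent))


# Illustrative weekly unit sales (invented numbers).
stable_units = [100, 102, 98, 101, 99, 100]       # flat staple
trending_units = [100, 110, 120, 130, 140, 150]   # grows 10 units per week

sma_stable = simple_moving_average(stable_units)     # 99.5
sma_trend = simple_moving_average(trending_units)    # 135.0
wma_trend = weighted_moving_average(trending_units)  # about 140
```

If the trend continues, next week's demand would be 160. The simple average forecasts 135 and the weighted average about 140, so both trail the trend, the weighted version slightly less.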
Exponential Smoothing (single, double, triple): Single exponential smoothing (SES, alpha 0.1–0.3) suits stationary demand with noise. Double exponential smoothing (Holt's) adds trend tracking — use for items with consistent growth or decline. Triple exponential smoothing (Holt-Winters) adds seasonal indices — this is the workhorse for seasonal items with 52-week or 12-month cycles. The alpha/beta/gamma parameters are critical: high alpha (>0.3) chases noise in volatile items; low alpha (<0.1) responds too slowly to regime changes. Optimize on holdout data, never on the same data used for fitting.
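A minimal sketch of single exponential smoothing with alpha chosen on a holdout segment, in plain Python. The function names, the candidate alphas, and the demand series are assumptions for illustration. A production setup would more likely use a forecasting library and would tune beta and gamma the same way.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead SES forecasts; forecasts[t] is the forecast for series[t]."""
    level = series[0]          # initialize the level at the first observation
    forecasts = [level]
    for actual in series[:-1]:
        level = alpha * actual + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


def holdout_mae(series, alpha, n_holdout):
    """Mean absolute one-step error over the last `n_holdout` periods only."""
    forecasts = ses_forecasts(series, alpha)
    pairs = zip(series[-n_holdout:], forecasts[-n_holdout:])
    return sum(abs(actual - forecast) for actual, forecast in pairs) / n_holdout


def pick_alpha(series, candidates, n_holdout):
    """Return the candidate alpha with the lowest holdout error."""
    return min(candidates, key=lambda a: holdout_mae(series, a, n_holdout))


# Invented series with a level shift (regime change) inside the holdout window.
shifted_units = [10] * 8 + [20] * 8
best_alpha = pick_alpha(shifted_units, candidates=(0.1, 0.3, 0.9), n_holdout=8)
```

On this series the highest alpha wins because the holdout contains a clean level shift and no noise. On a noisy but stationary series the same search would tend to favour a lower alpha.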
Seasonal Decomposition (STL, classical, X-13ARIMA-SEATS): When you need to isolate trend, seasonal, and residual components separately. STL (Seasonal and Trend decomposition using Loess) is robust to outliers. Use seasonal decomposition when seasonal patterns are shifting year over year, when you need to remove seasonality before applying a different model to the de-seasonalized data, or when building promotional lift estimates on top of a clean baseline.
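To make the three components concrete, here is a sketch of classical additive decomposition in plain Python. This is the simple classical method, not STL or X-13ARIMA-SEATS, and the function names and the quarterly series are invented for the example.

```python
def centered_moving_average(series, period):
    """Centered moving average trend; uses a 2 x period average for even periods."""
    half = period // 2
    trend = [None] * len(series)   # edges stay None: no full window there
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        if period % 2 == 0:
            # Half weight on the two end points so the window stays centered.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def classical_additive_decompose(series, period):
    """Split series into trend, seasonal, residual; also return deseasonalized data."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full cycles")
    trend = centered_moving_average(series, period)
    # Average the detrended values for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [sum(b) / len(b) for b in buckets]
    mean_raw = sum(raw) / period
    indices = [r - mean_raw for r in raw]   # force indices to sum to zero
    seasonal = [indices[t % period] for t in range(len(series))]
    residual = [None if tr is None else y - tr - s
                for y, tr, s in zip(series, trend, seasonal)]
    deseasonalized = [y - s for y, s in zip(series, seasonal)]
    return trend, seasonal, residual, deseasonalized


# Invented quarterly series: linear trend plus a fixed seasonal pattern.
pattern = [5, -5, 10, -10]
quarterly_units = [100 + 2 * t + pattern[t % 4] for t in range(12)]
trend, seasonal, residual, deseasonalized = classical_additive_decompose(
    quarterly_units, period=4
)
```

The `deseasonalized` series is what a second model, or a promotional lift estimate, would be built on. Because the classical method assumes a fixed seasonal pattern, it does not handle seasonality that shifts year over year; that is the case where STL is the better fit.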
Causal/Regression Models: When external factors drive demand beyond the item's own history: price, promotional flags, weather, competitor actions, local events. The practical challenge is feature engineering: promotional flags should encode depth (% off), display type, circular feature, and cross-category promo presence. Overfitting on sparse promo history is the single biggest pitfall. Regularize aggressively (Lasso/Ridge) and validate on an out-of-time holdout (the most recent weeks), not on a random out-of-sample split.
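The out-of-time idea can be sketched with a deliberately small example: a ridge fit on a single promo-depth feature, with the penalty chosen on the most recent weeks. The function names, the penalty grid, and the weekly rows are all hypothetical. A real causal model would carry many more features and would normally use a library implementation of Lasso or Ridge.

```python
def fit_ridge_1d(x, y, lam):
    """Ridge fit of y = intercept + slope * x, with the intercept left unpenalized."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx + lam == 0:
        raise ValueError("feature has no variation and lam is zero")
    slope = sxy / (sxx + lam)   # lam > 0 shrinks the slope toward zero
    return y_bar - slope * x_bar, slope


def wape(actuals, forecasts):
    """Weighted absolute percentage error: sum of |errors| over sum of |actuals|."""
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return total_error / sum(abs(a) for a in actuals)


def out_of_time_split(rows, n_validation):
    """Rows must be sorted by week; the last `n_validation` weeks are held out."""
    return rows[:-n_validation], rows[-n_validation:]


# Invented weekly rows: (promo depth as a fraction off, units sold).
weekly_rows = [(0.0, 100), (0.2, 140), (0.0, 100), (0.3, 160),
               (0.0, 100), (0.2, 140), (0.0, 100), (0.3, 160)]
train, validation = out_of_time_split(weekly_rows, n_validation=2)

scores = {}
for lam in (0.0, 0.1, 1.0):
    intercept, slope = fit_ridge_1d([d for d, _ in train], [u for _, u in train], lam)
    predictions = [intercept + slope * d for d, _ in validation]
    scores[lam] = wape([u for _, u in validation], predictions)
best_lam = min(scores, key=scores.get)
```

On this noise-free toy data the unregularized fit is exact, so `lam = 0.0` scores best. With sparse, noisy promo history a positive penalty will often do better on the held-out weeks, which is the case the paragraph above warns about.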
Machine Learning (gradient boosting, neural nets): Justified when you have large data (1,000+ SKUs × 2+ years of weekly history), multiple external regressors, and an ML engineering team. LightGBM/XGBoost with proper feature engineering can outperform simpler methods by 10–20% on WAPE for promotional and intermittent items. These models require continuous monitoring: model drift in retail is real, and quarterly retraining is the minimum.
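One way to turn the monitoring requirement into a concrete check is sketched below. The function names, the 91-day age limit (roughly one quarter), and the 20% degradation tolerance are assumptions chosen for illustration, not recommended settings.

```python
from datetime import date


def wape(actuals, forecasts):
    """Weighted absolute percentage error: sum of |errors| over sum of |actuals|."""
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return total_error / sum(abs(a) for a in actuals)


def needs_retraining(baseline_wape, recent_wape, trained_on, today,
                     max_age_days=91, tolerance=0.20):
    """Flag a model that is older than about a quarter or has drifted.

    baseline_wape: WAPE measured at validation time.
    recent_wape:   WAPE over the most recent scored weeks.
    """
    too_old = (today - trained_on).days >= max_age_days
    degraded = recent_wape > baseline_wape * (1 + tolerance)
    return too_old or degraded


# Invented example: recent accuracy on four weeks of actuals vs. forecasts.
recent_actuals = [120, 80, 150, 100]
recent_forecasts = [100, 100, 120, 110]
recent = wape(recent_actuals, recent_forecasts)   # 80 / 450

retrain = needs_retraining(
    baseline_wape=0.12,
    recent_wape=recent,
    trained_on=date(2024, 1, 1),
    today=date(2024, 2, 1),
)
```

Here the model is only a month old, but recent WAPE is well above the baseline plus tolerance, so the check flags it for retraining ahead of the quarterly schedule.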