Foundations: Baselines and classical models
Start with naive and seasonal-naive baselines, then move into moving averages and ARIMA/ETS. This gives a
stable benchmark before adding modern architectures that can hide failure modes.
Baseline (naive) -> Smoothed baseline (MA) -> Classical (ARIMA/ETS) -> Decomposition review
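As a sketch of the first two baseline steps, naive and seasonal-naive forecasts take only a few lines (the sample series and season length here are illustrative):

```python
import numpy as np

def naive_forecast(y, horizon):
    # Repeat the last observed value for every future step.
    return np.full(horizon, y[-1], dtype=float)

def seasonal_naive_forecast(y, horizon, season_length):
    # Repeat the last full seasonal cycle forward.
    last_cycle = y[-season_length:]
    reps = int(np.ceil(horizon / season_length))
    return np.tile(last_cycle, reps)[:horizon]

series = np.array([10, 12, 14, 11, 13, 15, 12, 14, 16], dtype=float)
print(naive_forecast(series, 3))
print(seasonal_naive_forecast(series, 3, season_length=3))
```

Any modern model that cannot beat these two on a rolling backtest is not yet earning its complexity.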
Modern forecasting: Multi-horizon and interpretable deep models
Train models to emit full forward paths, not single points. Pair sequence encoders with interpretable attention
and feature importance so commercial stakeholders can audit forecast drivers.
Features -> Encoder/Decoder -> Multi-horizon quantiles -> Attention attributions -> Decision layer
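The multi-horizon quantile step is usually trained and scored with the quantile (pinball) loss; a minimal sketch, with illustrative horizon and quantile levels:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    # Quantile (pinball) loss: under- and over-prediction are
    # penalized asymmetrically according to the quantile level q.
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

# A multi-horizon model emits one full forward path per quantile;
# each path is scored separately at its own level.
actuals = np.array([100.0, 105.0, 110.0])
paths = {
    0.1: np.array([90.0, 95.0, 100.0]),
    0.5: np.array([101.0, 104.0, 111.0]),
    0.9: np.array([115.0, 120.0, 125.0]),
}
for q, path in paths.items():
    print(q, round(pinball_loss(actuals, path, q), 3))
```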
Evaluation: Rolling validation and leakage controls
Use rolling windows with strict time ordering, then report point and interval metrics side by side. Validate
that feature snapshots and joins do not leak future information into training or evaluation windows.
Train window -> Validate on next slice -> Roll forward -> Refit -> Compare metrics and calibration
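The rolling window above can be sketched as an index-based splitter, assuming observations are already sorted by time; test indices always sit strictly after their training window, which is the leakage control in its simplest form:

```python
def rolling_origin_splits(n, train_size, test_size, step):
    # Yield (train_idx, test_idx) windows that only ever look forward.
    splits = []
    start = 0
    while start + train_size + test_size <= n:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        splits.append((train, test))
        start += step
    return splits

for train, test in rolling_origin_splits(n=10, train_size=5, test_size=2, step=2):
    print(train[-1], test)
```

Feature snapshots must obey the same rule: every feature value joined into a training row must have been observable before that row's timestamp.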
AWS Forecast metrics and Quantura point forecast update
The AWS Forecast metrics guide is a strong reference for reading backtests: it lays out how Forecast scores predictors across backtest windows and how it separates quantile forecasts from point-style operating decisions. It covers wQL, Average wQL, WAPE, RMSE, MAPE, and MASE, and the practical reading is simple: lower values indicate better predictors across all of those reported metrics.
- Average wQL is the mean of the selected quantile losses, with AWS defaulting to 0.10, 0.50, and 0.90.
- wQL is quantile-specific and useful when underprediction and overprediction have different business costs.
- WAPE, RMSE, MAPE, and MASE summarize overall error from the backtest windows rather than a single quantile slice.
- AWS forecast types can include mean plus up to five custom quantiles from 0.01 through 0.99 via the ForecastTypes parameter.
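Assuming the standard weighted quantile loss definition (scaled by the total absolute actuals across the backtest window), wQL and Average wQL can be sketched as:

```python
import numpy as np

def wql(y_true, y_pred, tau):
    # Weighted quantile loss: quantile loss summed over the window,
    # scaled by total absolute actuals so series of different volume
    # are comparable.
    under = np.maximum(y_true - y_pred, 0.0)
    over = np.maximum(y_pred - y_true, 0.0)
    return 2.0 * np.sum(tau * under + (1 - tau) * over) / np.sum(np.abs(y_true))

y = np.array([100.0, 120.0, 80.0])
preds = {
    0.1: np.array([80.0, 100.0, 60.0]),
    0.5: np.array([98.0, 121.0, 79.0]),
    0.9: np.array([120.0, 140.0, 100.0]),
}
losses = [wql(y, p, q) for q, p in preds.items()]
# Average wQL is the mean over the selected quantile levels.
print("Average wQL:", round(sum(losses) / len(losses), 4))
```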
Backtest windows -> Quantile metrics (wQL, Average wQL) + point metrics (WAPE, RMSE, MAPE, MASE) -> Forecast type selection -> Operating decision
AWS documentation makes two important points: you can request mean as a forecast type, and you can
also use a quantile as the effective point forecast when the cost of underpredicting differs from the cost of
overpredicting. That is why AWS supports mixed forecast-type selections such as 0.01,
mean, 0.65, and 0.99.
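One standard way to pick which quantile should act as the point forecast is the newsvendor critical fractile; this is a textbook heuristic for asymmetric costs, not something the AWS guide itself prescribes:

```python
def operating_quantile(cost_under, cost_over):
    # Newsvendor-style critical fractile: the quantile level that
    # minimizes expected cost when underprediction and overprediction
    # carry different unit costs.
    return cost_under / (cost_under + cost_over)

# If a stockout costs 9x what excess inventory costs,
# operate near the 0.90 quantile rather than the mean or median.
print(operating_quantile(cost_under=9.0, cost_over=1.0))
```

This is why a mixed selection like 0.01, mean, 0.65, 0.99 is useful: different downstream decisions can each read the forecast type matching their own cost asymmetry.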
In Quantura's SageMaker Canvas stock workflow, the exported prediction files are quantile-first and do not expose a separate mean column, so the operative point forecast is the middle quantile. In practice that middle column is usually P50 (0.5), and it becomes the anchor for the business-day comparison logic.
The Quantura point forecast update works as follows: take the middle quantile on the first actionable business day of the forecast round and on the last valid business day of the forecast, then compare the two at the second significant figure (for most stocks that means rounding to the tens place). If the last middle quantile is lower than the first at that significance level, use the upper bound as the operative point level; otherwise use the lower bound. This keeps the decision anchored to the middle quantile while still adapting to the direction implied by the business-day path.
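A sketch of that update rule, with hypothetical helper names: "upper bound" and "lower bound" are taken to mean the round's upper and lower quantile columns (e.g. P90 and P10), and tens-place rounding stands in for the second significant figure as described:

```python
def round_to_tens(x):
    # Tens-place rounding approximates "second significant figure"
    # for typical three-digit stock prices (an assumption from the text).
    return round(x, -1)

def operative_point_level(first_p50, last_p50, lower_bound, upper_bound):
    # If the business-day path ends below where it started at the
    # rounded significance level, lean on the upper bound; otherwise
    # lean on the lower bound.
    if round_to_tens(last_p50) < round_to_tens(first_p50):
        return upper_bound
    return lower_bound

# Path drifts down (212 -> 188 at P50), so the upper bound is chosen.
print(operative_point_level(first_p50=212.0, last_p50=188.0,
                            lower_bound=180.0, upper_bound=205.0))
```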
Signal engineering: Demand inflection features
Convert forecast curves into actionable features: first and second derivatives (delta and acceleration), surprise versus realized values, and regime filters that suppress low-confidence transitions.
Forecast curve -> Delta/acceleration -> Surprise vs actual -> Regime filter -> Signal score
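The delta and acceleration features reduce to first and second differences of the forecast path; a minimal sketch:

```python
import numpy as np

def inflection_features(forecast_path):
    # First difference (delta) and second difference (acceleration)
    # of the forecast curve; a sign change in acceleration marks a
    # candidate inflection.
    delta = np.diff(forecast_path)
    accel = np.diff(delta)
    return delta, accel

path = np.array([100.0, 104.0, 110.0, 112.0, 111.0])
delta, accel = inflection_features(path)
print(delta)   # step-to-step change
print(accel)   # change in the change
```

Surprise is then simply forecast minus realized value once actuals arrive, and a regime filter zeroes out the signal score when prediction intervals are too wide to trust the transition.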
MLOps: CI/CD and continuous training
Treat model updates as software releases with deterministic builds, test gates, staged promotion, and automated
rollback triggers for unstable live performance.
Commit -> Validate data/features -> Train -> Evaluate gates -> Register -> Deploy canary -> Promote
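The "Evaluate gates" step above can be sketched as a champion/challenger comparison; metric names and the regression tolerance are illustrative, and all metrics are assumed to be lower-is-better:

```python
def passes_promotion_gate(candidate_metrics, champion_metrics, max_regression=0.02):
    # Promote only if the candidate does not regress any tracked
    # metric by more than the allowed tolerance.
    for name, champ_value in champion_metrics.items():
        cand_value = candidate_metrics.get(name)
        if cand_value is None or cand_value > champ_value * (1 + max_regression):
            return False
    return True

champion = {"wape": 0.12, "rmse": 9.5}
candidate = {"wape": 0.11, "rmse": 9.6}
# A small RMSE regression within tolerance still passes.
print(passes_promotion_gate(candidate, champion))
```

The same predicate, evaluated against live canary metrics instead of backtest metrics, doubles as the automated rollback trigger.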
MLOps: Data validation and model monitoring
Monitor feature distributions, input quality, prediction drift, and business KPIs together. Trigger retraining
only when quality gates or drift thresholds are breached.
Incoming data -> Quality checks -> Drift/skew checks -> Alerts -> Retraining trigger policy
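One common drift/skew check is the population stability index (PSI) between a baseline window and live traffic; the 0.2 trigger threshold in the comment is a widely used rule of thumb, not a universal standard:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    # PSI compares binned feature distributions; values above ~0.2
    # are commonly treated as a retraining trigger.
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_counts, _ = np.histogram(expected, bins=edges)
    a_counts, _ = np.histogram(actual, bins=edges)
    e_pct = np.clip(e_counts / e_counts.sum(), 1e-6, None)
    a_pct = np.clip(a_counts / a_counts.sum(), 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 5000)
shifted = rng.normal(0.5, 1.0, 5000)  # live data with a mean shift
print(round(population_stability_index(baseline, shifted), 3))
```

Gating retraining on this kind of threshold, rather than retraining on a fixed schedule, keeps the trigger policy tied to observed degradation.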
MLOps: Experiment tracking, registry, lineage
Every model should be reproducible from data snapshot to deployment hash. Link experiment IDs to model registry
entries and keep inference lineage for governance and postmortem debugging.
Dataset version + code commit + params -> Run ID -> Registered model -> Deployed endpoint -> Live metrics
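The lineage chain above can be captured in a small record whose fingerprint ties a registry entry to its exact inputs; the snapshot path, commit, and run ID below are hypothetical:

```python
from dataclasses import dataclass, asdict
import hashlib
import json

@dataclass(frozen=True)
class ModelLineage:
    dataset_version: str   # immutable data snapshot identifier
    code_commit: str       # git commit the training code ran from
    params: dict           # hyperparameters for this run
    run_id: str            # experiment-tracking run identifier

    def fingerprint(self):
        # Deterministic hash over all inputs: two runs with identical
        # inputs produce the same fingerprint, so the registry entry
        # is reproducible by construction.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()[:12]

record = ModelLineage("snapshot-2024-06-01", "a1b2c3d", {"lr": 0.001}, "run-0042")
print(record.fingerprint())
```

Storing this fingerprint alongside every inference response gives the governance trail needed for postmortem debugging.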