Questions: ARIMA Models and Time Series Forecasting
5 questions to test your understanding
Question 1 Multiple Choice
A monthly employment series trends persistently upward over decades. A researcher wants to fit an ARIMA model. Which transformation should be applied first, and why?
A. Apply a log transformation to normalize the variance
B. First-difference the series to remove the trend and achieve stationarity before modeling
C. Detrend by regressing on time, then fit an AR model to the residuals
D. Apply the model directly; ARIMA handles trends internally without preprocessing
The 'I' in ARIMA stands for 'integrated': it refers to differencing the series d times until it is stationary. A trending series violates the stationarity assumption required for the AR and MA components. Taking the first difference (the change from period to period) removes a stochastic trend and reduces a linear deterministic trend to a constant. Option D is the tempting wrong answer: ARIMA does handle non-stationarity, but it does so *through* the differencing step, which must be specified explicitly by choosing d. Fitting AR or MA components directly to a non-stationary level series produces spurious results. Option C (deterministic detrending) is a different, less general approach that assumes a fixed linear trend rather than a stochastic one.
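The effect of first differencing can be sketched on simulated data. The example below is only an illustration under stated assumptions: a random walk with drift stands in for the employment series, and the drift value, seed, and helper names (`autocorr`, `difference`) are hypothetical choices, not part of the quiz.

```python
import random


def autocorr(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[t] - mean) * (xs[t - lag] - mean) for t in range(lag, n))
    return cov / var


def difference(xs):
    """First difference: change from one period to the next."""
    return [xs[t] - xs[t - 1] for t in range(1, len(xs))]


rng = random.Random(0)  # fixed seed so the run is repeatable
drift = 0.2             # assumed average increase per period

# Random walk with drift: a simple stand-in for a trending level series.
level = [0.0]
for _ in range(2000):
    level.append(level[-1] + drift + rng.gauss(0.0, 1.0))

diffed = difference(level)

level_acf1 = autocorr(level, 1)        # close to 1: non-stationary level
diff_acf1 = autocorr(diffed, 1)        # close to 0: trend removed
diff_mean = sum(diffed) / len(diffed)  # close to the drift

print("lag-1 autocorrelation, levels:     ", round(level_acf1, 3))
print("lag-1 autocorrelation, differences:", round(diff_acf1, 3))
print("mean of differences:               ", round(diff_mean, 3))
```

The level series is almost perfectly correlated with its own past, while the differenced series fluctuates around a constant mean equal to the drift, which is the behavior the AR and MA components assume.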
Question 2 Multiple Choice
An AR(1) model fits an economic series well, but the Ljung-Box test shows significant autocorrelation in the residuals at lag 1. Adding an MA(1) term eliminates the remaining autocorrelation. Why does adding the MA component help here?
A. The MA term increases the model's degrees of freedom, automatically reducing autocorrelation
B. The MA term captures decay of shock effects: past forecast errors are influencing current values, which the AR term alone cannot model
C. The AR and MA terms together always produce white noise residuals regardless of the data
D. The MA term removes the need for the differencing step by absorbing the trend
AR and MA components capture two distinct memory mechanisms. AR says the current value depends on past *values* of the series — persistent autocorrelation in levels. MA says the current value depends on past *forecast errors* (shocks) — the decay of one-time disturbances. If a shock (e.g., a policy announcement) affects the series for a few periods before fading, the AR term alone will leave residual autocorrelation because it cannot model this shock-decay pattern efficiently. Adding MA terms addresses precisely this. The residual autocorrelation after fitting AR(1) is the diagnostic signal that shock effects are present. Option A is wrong because degrees of freedom alone do not eliminate autocorrelation structure.
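The diagnostic signal described above can be reproduced with a small simulation. This is a sketch under assumptions, not a full estimation routine: the data come from a simulated ARMA(1,1) process with illustrative coefficients (`phi = 0.5`, `theta = 0.6`), and only an AR(1) is fitted, by least squares without an intercept, to show what is left in the residuals. Fitting the MA term itself would normally be done with a maximum-likelihood routine from a statistics library, which is not shown here.

```python
import random


def autocorr(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[t] - mean) * (xs[t - lag] - mean) for t in range(lag, n))
    return cov / var


rng = random.Random(1)
phi, theta = 0.5, 0.6  # assumed true AR and MA coefficients
n = 5000

# Simulate y_t = phi * y_{t-1} + e_t + theta * e_{t-1}.
shocks = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [shocks[0]]
for t in range(1, n):
    y.append(phi * y[t - 1] + shocks[t] + theta * shocks[t - 1])

# Fit an AR(1) only: least-squares slope of y_t on y_{t-1}
# (no intercept, since the simulated process has zero mean).
num = sum(y[t] * y[t - 1] for t in range(1, n))
den = sum(y[t - 1] ** 2 for t in range(1, n))
phi_hat = num / den

# Residuals of the misspecified AR(1) fit.
resid = [y[t] - phi_hat * y[t - 1] for t in range(1, n)]
resid_acf1 = autocorr(resid, 1)

print("AR(1) estimate:", round(phi_hat, 3), "(true phi is", phi, ")")
print("lag-1 autocorrelation of AR(1) residuals:", round(resid_acf1, 3))
```

Because the AR(1) fit has no way to represent the lagged shock, its coefficient absorbs part of the MA effect and the residuals remain correlated at lag 1, which is the pattern a Ljung-Box test would flag.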
Question 3 True / False
An ARIMA model can be applied directly to a non-stationary series because the model's parameters automatically adjust to account for trends.
True
False
Answer: False
False. ARIMA handles non-stationarity through the 'd' (differencing) parameter, which must be explicitly chosen and applied before fitting the AR and MA components. If you attempt to fit AR and MA terms to a non-stationary series without differencing (d = 0 when d should be 1 or more), the estimated AR coefficients are biased in finite samples and follow non-standard distributions, so the usual t-tests and confidence intervals are unreliable. This is closely related to the spurious regression problem. The Box-Jenkins methodology specifically requires testing for stationarity (using ADF or KPSS tests), then differencing d times until the series is stationary, and only then estimating the ARMA(p,q) components.
Question 4 True / False
Over-differencing a time series — applying more differences than needed to achieve stationarity — can introduce spurious autocorrelation into an otherwise clean series.
True
False
Answer: True
True. If a series is already stationary after first differencing (d = 1), taking a second difference (d = 2) creates a new series in which each value combines the original series at lags 0, 1, and 2. This introduces a moving-average unit root into the differenced series, creating artificial autocorrelation structure that was not present in the correctly differenced data. In the simplest case, where the first difference is white noise, the second difference has a lag-1 autocorrelation of -0.5. The model then needs extra MA terms to soak up this induced pattern, leading to a more complex and less interpretable model. The principle is: difference only as many times as needed to achieve stationarity, verified through formal tests, not automatically or excessively.
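Over-differencing is easy to see numerically. The sketch below assumes the simplest case, a pure random walk, so that one difference already yields white noise; the seed, series length, and helper names are illustrative only.

```python
import random


def autocorr(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[t] - mean) * (xs[t - lag] - mean) for t in range(lag, n))
    return cov / var


def difference(xs):
    """First difference: change from one period to the next."""
    return [xs[t] - xs[t - 1] for t in range(1, len(xs))]


rng = random.Random(2)

# Pure random walk: needs exactly one difference (d = 1).
walk = [0.0]
for _ in range(5000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

d1 = difference(walk)  # correctly differenced: white noise
d2 = difference(d1)    # over-differenced: e_t - e_{t-1}

d1_acf1 = autocorr(d1, 1)
d2_acf1 = autocorr(d2, 1)

print("lag-1 autocorrelation after d = 1:", round(d1_acf1, 3))
print("lag-1 autocorrelation after d = 2:", round(d2_acf1, 3))
```

The once-differenced series shows essentially no lag-1 autocorrelation, while the twice-differenced series shows a strong negative value near the theoretical -0.5 of a non-invertible MA(1). A large negative spike at lag 1 after differencing is therefore a common warning sign that d is too high.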
Question 5 Short Answer
Explain what the MA component adds to an AR model, and describe a real-world situation where including MA terms would be important.
Model answer: The MA component models dependence on past forecast errors (shocks) rather than past values of the series itself. An MA(q) term means today's value is influenced by the residuals of the last q periods. This captures situations where a random disturbance (a sudden policy shock, weather event, or one-time disruption) has effects that persist for a limited number of subsequent periods. An AR model cannot capture this efficiently because it expresses current values in terms of past levels, not the history of surprises.
A concrete example: after a central bank unexpectedly raises interest rates (a shock), economic activity may be suppressed for several months before recovering to trend. An AR model would need many lags to approximate this decay pattern, while a single MA term can represent it parsimoniously. In practice, the ACF and PACF patterns guide selection: a pure MA(q) process shows an ACF that cuts off abruptly after lag q while its PACF tails off, and a pure AR(p) process shows a PACF that cuts off after lag p while its ACF tails off. Mixed ARMA processes show gradual decay in both, which is why the identification stage using these tools matters.
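The ACF half of this identification rule can be checked on simulated data. The following is a rough sketch under assumptions: it simulates an MA(1) and an AR(1) with an illustrative coefficient of 0.8 each and compares their sample autocorrelations at lags 1 to 3. The PACF is not computed here; in practice a statistics library would supply both.

```python
import random


def autocorr(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[t] - mean) * (xs[t - lag] - mean) for t in range(lag, n))
    return cov / var


rng = random.Random(3)
n = 20000
theta = 0.8  # assumed MA(1) coefficient
phi = 0.8    # assumed AR(1) coefficient

e = [rng.gauss(0.0, 1.0) for _ in range(n)]

# MA(1): y_t = e_t + theta * e_{t-1}
ma = [e[t] + theta * e[t - 1] for t in range(1, n)]

# AR(1): y_t = phi * y_{t-1} + e_t
ar = [e[0]]
for t in range(1, n):
    ar.append(phi * ar[-1] + e[t])

lags = (1, 2, 3)
ma_acf = [autocorr(ma, k) for k in lags]
ar_acf = [autocorr(ar, k) for k in lags]

# Theoretical lag-1 autocorrelation of an MA(1) process.
ma_rho1 = theta / (1 + theta ** 2)

print("MA(1) sample ACF, lags 1-3:", [round(r, 3) for r in ma_acf])
print("AR(1) sample ACF, lags 1-3:", [round(r, 3) for r in ar_acf])
print("MA(1) theoretical lag-1 value:", round(ma_rho1, 3))
```

The MA(1) series has one sizeable autocorrelation at lag 1 and values near zero afterwards (the cut-off), whereas the AR(1) autocorrelations shrink geometrically without ever dropping abruptly to zero (the tail-off).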