Questions — Time Series Forecasting

Question 1 Multiple Choice

A data scientist randomly splits two years of hourly sales data 80/20 into train and test sets, trains an LSTM, and reports excellent test accuracy. What is the fundamental problem?

A. LSTMs are not appropriate for sales data; a simpler ARIMA model should have been used

B. The test set likely contains timestamps from before the end of the training set, so the model was effectively trained on 'future' information it would not have in production

C. An 80/20 split does not provide enough training data for a neural network

D. Sales data is too noisy for any forecasting model to achieve high accuracy

Question 2 Multiple Choice

A naive baseline that predicts the last observed value ('predict t+1 = t') outperforms a carefully tuned LSTM on a stationary demand series. What does this most likely indicate?

A. LSTMs are computationally too slow for real-time forecasting

B. The naive baseline is guaranteed to have lower RMSE by mathematical construction

C. The series has strong autocorrelation and is well-behaved; the LSTM likely overfit to noise in training, failing to add value over a simple recency heuristic

D. LSTMs require more than one year of training data to outperform naive baselines
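For reference, the naive baseline named in this question can be sketched in a few lines. This is only an illustration of the 'predict t+1 = t' rule and of RMSE scoring; the `demand` values and the helper names `naive_forecast` and `rmse` are made up for this sketch and are not part of the question.

```python
import math

def naive_forecast(series):
    """One-step-ahead naive forecast: the prediction for step t+1 is the value at t."""
    return series[:-1]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical demand values, used only to exercise the functions.
demand = [100, 102, 101, 103, 104, 102]

predictions = naive_forecast(demand)  # forecasts for steps 1..5
actuals = demand[1:]                  # observed values at steps 1..5
baseline_rmse = rmse(actuals, predictions)
print(f"Naive baseline RMSE: {baseline_rmse:.3f}")
```

Any candidate model would be scored on the same `actuals` so that its error is directly comparable to the baseline's.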

Question 3 True / False

Computing normalization statistics (mean and standard deviation) over the entire dataset — including the test period — before splitting is a valid preprocessing step for time series forecasting.

T. True

F. False

Question 4 True / False

Walk-forward (rolling-origin) validation is more appropriate than k-fold cross-validation for evaluating time series forecasting models.

T. True

F. False
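For readers unfamiliar with the term, a rolling-origin scheme can be sketched roughly as follows. This is a minimal illustration assuming an expanding training window; the function name `rolling_origin_splits` and its parameters `initial_train` and `horizon` are hypothetical and chosen only for this sketch.

```python
def rolling_origin_splits(n_samples, initial_train, horizon):
    """Yield (train_indices, test_indices) pairs with an expanding training window.

    The forecast origin starts at `initial_train` and moves forward by
    `horizon` steps after each split.
    """
    origin = initial_train
    while origin + horizon <= n_samples:
        train_indices = list(range(origin))
        test_indices = list(range(origin, origin + horizon))
        yield train_indices, test_indices
        origin += horizon

# Example: 10 observations, first 4 used for the initial fit, 2-step test windows.
splits = list(rolling_origin_splits(n_samples=10, initial_train=4, horizon=2))
for train_indices, test_indices in splits:
    print("train:", train_indices, "test:", test_indices)
```

In this sketch the training window grows at each step; a fixed-length sliding window is a common variant.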

Question 5 Short Answer

Explain why non-stationarity (trend and seasonality) must be diagnosed before fitting a classical forecasting model, and what happens if it is ignored.

