Questions — Deep Q-Networks (DQN)

Question 1 Multiple Choice

A student implements DQN but omits experience replay, training directly on consecutive game transitions in order. They observe the agent rapidly learns one section of the game but then 'forgets' earlier patterns. What does this illustrate?

A. The neural network has insufficient capacity to memorize the entire game's state space

B. Consecutive game frames are highly correlated, causing the network to overfit to recent experience and catastrophically forget lessons from earlier states

C. The target network is updating too slowly to keep pace with the rapidly changing policy

D. Convolutional layers cannot generalize across different game screen positions without replay

Question 2 Multiple Choice

What problem does the DQN target network solve, and how?

A. It provides extra training data by generating synthetic rollouts when real experience is sparse

B. It prevents Q-values from diverging to infinity by clamping the maximum target value to a fixed scale

C. It stabilizes learning by providing temporarily stationary targets: a frozen copy of the network computes training targets, updated only periodically so the Q-network learns toward a stable objective

D. It ensures exploration by generating random actions until the main network's Q-values converge

Question 3 True / False

DQN can learn directly from raw pixel inputs because convolutional layers extract spatial features that the fully connected output layers map to per-action Q-values.

T. True

F. False

Question 4 True / False

Without experience replay, DQN would still converge because the Q-learning update rule is mathematically designed to handle correlated sequential observations.

T. True

F. False

Question 5 Short Answer

Why was combining neural networks with Q-learning notoriously unstable before DQN, and which two innovations made it tractable?

