Questions: Deep Q-Networks (DQN)

5 questions to test your understanding

Question 1 Multiple Choice

A student implements DQN but omits experience replay, training directly on consecutive game transitions in order. They observe the agent rapidly learns one section of the game but then 'forgets' earlier patterns. What does this illustrate?

A. The neural network has insufficient capacity to memorize the entire game's state space
B. Consecutive game frames are highly correlated, causing the network to overfit to recent experience and catastrophically forget lessons from earlier states
C. The target network is updating too slowly to keep pace with the rapidly changing policy
D. Convolutional layers cannot generalize across different game screen positions without replay

Question 2 Multiple Choice

What problem does the DQN target network solve, and how?

A. It provides extra training data by generating synthetic rollouts when real experience is sparse
B. It prevents Q-values from diverging to infinity by clamping the maximum target value to a fixed scale
C. It stabilizes learning by providing temporarily stationary targets: a frozen copy of the network computes training targets, updated only periodically so the Q-network learns toward a stable objective
D. It ensures exploration by generating random actions until the main network's Q-values converge

Question 3 True / False

DQN can learn directly from raw pixel inputs because convolutional layers extract spatial features that the fully connected output layers map to per-action Q-values.

T. True
F. False

Question 4 True / False

Without experience replay, DQN would still converge because the Q-learning update rule is mathematically designed to handle correlated sequential observations.

T. True
F. False

Question 5 Short Answer

Why was combining neural networks with Q-learning notoriously unstable before DQN, and which two innovations made it tractable?
