Questions: Monte Carlo Tree Search

5 questions to test your understanding

Question 1 Multiple Choice

A student claims MCTS evaluates board positions using a heuristic function, just like minimax. What actually happens during the simulation (rollout) phase of MCTS?

A. MCTS calls a neural network to evaluate the position and assign a score
B. MCTS uses the same alpha-beta pruning as minimax but in a randomized order
C. MCTS plays the game out to completion using a random or lightly guided policy, then uses the win/loss result as the evaluation
D. MCTS applies a domain-specific heuristic to estimate the probability of winning from the position
Question 2 Multiple Choice

After running MCTS for a fixed time budget, how do you select which move to actually play?

A. Choose the child of the root with the highest UCB1 score, since that balances value and exploration
B. Choose the child of the root that was visited most often, as visit count reflects accumulated confidence
C. Choose the child of the root with the highest average reward, regardless of visit count
D. Choose a random child, weighted by each child's average reward
Question 3 True / False

MCTS requires that the game be played to a terminal state before any useful information is obtained, meaning it cannot return a move recommendation until the search is complete.

T. True
F. False
Question 4 True / False

The UCB1 formula in MCTS selects moves with high average reward but also adds an exploration bonus for moves that have been visited less often.

T. True
F. False
Question 5 Short Answer

Explain how the UCB1 formula in MCTS prevents the algorithm from permanently ignoring a move that happened to lose in its first few simulations.

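For reference, the UCB1 score mentioned in Questions 4 and 5 can be sketched as below. This is a minimal illustration rather than a definitive implementation: the function name `ucb1`, its parameter names, the default exploration constant `c = sqrt(2)`, and the example numbers are all assumptions chosen for this sketch. Treating unvisited children as having an infinite score is a common convention, not a requirement.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child: average reward plus an exploration bonus."""
    if visits == 0:
        # Assumed convention: an unvisited child is always tried first.
        return math.inf
    exploitation = total_reward / visits
    # The bonus grows with the parent's visit count and shrinks with the child's.
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration


# Hypothetical numbers: one move lost all 3 of its rollouts,
# while a rival move won 60 of its 100 rollouts.
early = ucb1(total_reward=0, visits=3, parent_visits=103)
rival = ucb1(total_reward=60, visits=100, parent_visits=103)

# If the unlucky move stays at 3 visits while the parent keeps being visited,
# its exploration bonus keeps increasing.
late = ucb1(total_reward=0, visits=3, parent_visits=100_000)
```

Comparing `early`, `rival`, and `late` shows how the exploration term changes as the parent accumulates visits while a child's own visit count stays fixed.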