Questions — Monte Carlo Tree Search

Question 1 Multiple Choice

A student claims MCTS evaluates board positions using a heuristic function, just like minimax. What actually happens during the simulation (rollout) phase of MCTS?

A. MCTS calls a neural network to evaluate the position and assign a score

B. MCTS uses the same alpha-beta pruning as minimax but in a randomized order

C. MCTS plays the game out to completion using a random or lightly guided policy, then uses the win/loss result as the evaluation

D. MCTS applies a domain-specific heuristic to estimate the probability of winning from the position

Question 2 Multiple Choice

After running MCTS for a fixed time budget, how do you select which move to actually play?

A. Choose the child of the root with the highest UCB1 score, since that balances value and exploration

B. Choose the child of the root that was visited most often, as visit count reflects accumulated confidence

C. Choose the child of the root with the highest average reward, regardless of visit count

D. Choose a random child, weighted by each child's average reward

Question 3 True / False

MCTS requires that the game be played to a terminal state before any useful information is obtained, meaning it cannot return a move recommendation until the search is complete.

T. True

F. False

Question 4 True / False

The UCB1 formula in MCTS selects moves with high average reward but also adds an exploration bonus for moves that have been visited less often.

T. True

F. False

Question 5 Short Answer

Explain how the UCB1 formula in MCTS prevents the algorithm from permanently ignoring a move that happened to lose in its first few simulations.

