A Markov chain models a system where every transition has a fixed probability. An MDP (Markov decision process) additionally has nondeterministic choices. Why is the MDP model more expressive?
Think about your answer, then reveal below.
Model answer: A Markov chain has a single probability distribution over the next state from any state. An MDP allows nondeterminism: from a state, there may be multiple possible actions, each with its own probability distribution. This models systems controlled by both a scheduler (making nondeterministic choices about which action to take) and randomness (once an action is chosen, its outcome is probabilistic). For example, a communication protocol might nondeterministically choose to retry (scheduler choice) and then probabilistically succeed or fail (randomness). MDP model checking computes the worst-case (minimum) and best-case (maximum) probabilities over all possible scheduler choices, answering 'what is the worst that could happen?' and 'what is the best that could happen?'
The nondeterminism in MDPs captures the interplay between controllable and uncontrollable behaviors. A scheduler is an adversary or controller: in game-theoretic verification, you ask 'can the scheduler ensure property P?', answering in the affirmative if a scheduler exists that guarantees P. In synthesis, you find the scheduler that achieves the best probability. This is more expressive than pure probabilistic systems but computationally harder: model checking MDPs is NP-hard, while Markov chains can be checked in polynomial time.
Question 2 Short Answer
In probabilistic temporal logic, the formula P(φ ≥ 0.95) means 'the probability of φ is at least 95%.' For a Markov chain, this is evaluated as: compute the probability of φ, check if it is ≥ 0.95. For an MDP, what does this formula mean?
Think about your answer, then reveal below.
Model answer: For an MDP, P(φ ≥ 0.95) means: 'there exists a scheduler such that the probability of φ is at least 95%.' Equivalently, the maximum probability (best-case under optimal scheduling) of φ is ≥ 0.95. Dually, P(φ ≤ 0.05) means the minimum probability (worst-case, over all schedulers) is ≤ 0.05. MDP model checking computes both the maximum and minimum probabilities of a property over all possible schedulers, giving bounds on what could happen. This is essential for systems where some choices are adversarial (worst-case analysis) or cooperative (best-case analysis).
The existential/universal quantification over schedulers is the key difference. In a Markov chain, there is one probability; in an MDP, there is a range (minimum to maximum) depending on the scheduler. Verification problems include: 'can you guarantee at least 95% success?', answered by checking if the maximum probability is ≥ 0.95, and 'is there a scheduler that might lead to failure?', answered by checking if the minimum probability of success is < 100%.
Question 3 Multiple Choice
Rewards in probabilistic model checking allow reasoning about expected values beyond just probabilities. A system might have a property: 'the expected time to recover from failure is at most 10 seconds.' How would you verify this?
AAssign a time cost to each transition and use model checking to compute the expected cost (sum of costs on all paths, weighted by path probability). Check if this expected value is ≤ 10
BSimulate the system 1000 times and average the recovery times
CFormally verify the system cannot fail
DMeasure the system on a test run
Reward model checking extends probabilistic model checking by assigning numeric values (rewards or costs) to states and transitions. Transitions might have time costs; states might have resource usage costs. The model checker computes the expected total reward on paths satisfying a property (e.g., paths from 'failure' to 'recovered'). The expectation is weighted by path probability: if a path has 30% probability and costs 8 seconds, it contributes 0.3 × 8 to the expectation. Model checking sums over all paths, computing the exact expected cost. This is much more precise than simulation (which requires many runs) and gives absolute guarantees (within the model's accuracy).
Question 4 Short Answer
Probabilistic model checking requires solving systems of equations. For a Markov chain computing the probability of reaching a goal state, the equations are linear (probability of reaching goal from state s = sum over next states of (transition probability * probability from next state)). Why does this enable efficient checking?
Think about your answer, then reveal below.
Model answer: Linear systems of equations can be solved in polynomial time using Gaussian elimination or iterative methods like value iteration. For a Markov chain with n states, solving the system takes O(n^3) time, making probabilistic model checking scalable. In contrast, model checking nonprobabilistic systems typically involves NP-hard problems (SAT solving). The linearity of probabilistic systems is a computational advantage: even though you're computing probabilities over exponentially many paths, the recursive structure allows polynomial-time solution.
This is a counterintuitive advantage of probabilistic systems: adding randomness makes the verification problem easier (polynomial) rather than harder. The reason is that linear equations are tractable, whereas Boolean satisfiability (underlying standard model checking) is NP-hard. Modern probabilistic model checkers like PRISM exploit this: they build an automaton representing all paths, then set up and solve a system of linear equations, computing exact probabilities in polynomial time. For MDPs, the problem is harder (NP-hard, requiring game-theoretic reasoning) but still much more tractable than standard model checking of complex systems.