Questions: Phantom Read Anomaly: New Rows Appearing
5 questions to test your understanding
Score: 0 / 5
Question 1 Multiple Choice
Transaction A runs SELECT * FROM orders WHERE status = 'pending' and gets 50 rows. Transaction B then inserts a new pending order and commits. Transaction A runs the same query again and gets 51 rows. Which isolation level, at minimum, prevents this anomaly?
ARead committed — it only reads committed data
BRepeatable read — it locks all rows read by the transaction
CSerializable — it locks the predicate, preventing new matching rows from being inserted
DRead uncommitted — it blocks all concurrent writes
Repeatable read prevents modification of rows already read, but it cannot prevent new rows from being inserted. The phantom row (row 51) didn't exist when Transaction A first read — there was nothing to lock. Serializable isolation prevents phantoms by locking the predicate (the condition 'status = pending'), so no new row matching that condition can be inserted while Transaction A holds its lock. Read committed and read uncommitted provide even weaker guarantees and cannot prevent phantom reads.
Question 2 Multiple Choice
A banking system sums all transactions for an account within a transaction, then uses that sum to decide whether to approve a transfer. A phantom read occurs mid-transaction. What is the WORST CASE outcome?
AThe sum query fails with an error, forcing a retry
BThe transfer decision is made on a stale sum that doesn't reflect a concurrent deposit, potentially approving an incorrect amount
CThe transaction is automatically rolled back by the database
DThe sum is recalculated automatically to include the new row
Phantom reads are dangerous precisely because they are silent — no error is raised, no rollback occurs. The first query returned a correct sum at that moment; the decision logic proceeds as if that sum is definitive. But a new transaction committed between the two reads, making the sum stale. The database does not alert the transaction to this inconsistency under repeatable read or weaker isolation. This is why serializable isolation exists for financial applications where result set completeness matters.
Question 3 True / False
Repeatable read isolation prevents phantom reads because it locks most rows that match a query's WHERE clause.
TTrue
FFalse
Answer: False
Repeatable read locks rows that ALREADY EXIST and have been read — it prevents those specific rows from being modified by other transactions. But a phantom is a NEW row that didn't exist when the first read occurred. You cannot place a lock on a row that doesn't exist yet. Preventing phantoms requires locking the predicate (the WHERE condition itself), which is what serializable isolation provides. This is the architectural gap that repeatable read cannot bridge.
Question 4 True / False
Preventing phantom reads requires locking not just existing data rows, but the predicate — the condition that defines which rows qualify — so that new qualifying rows cannot be inserted.
TTrue
FFalse
Answer: True
This is the essential insight. Phantom prevention is fundamentally about locking the absence of data — the 'space' where new matching rows could appear. Implementations vary: some databases use predicate locks (explicit locks on WHERE conditions), others use index-range locks (locking ranges in an index that covers the predicate), and others use serializable snapshot isolation (detecting conflicts at commit time). All approaches must prevent another transaction from inserting a row that retroactively satisfies a predicate already evaluated.
Question 5 Short Answer
Why can't row-level locking, even if perfectly implemented, prevent phantom reads? What additional mechanism is required?
Think about your answer, then reveal below.
Model answer: Row-level locking can only apply to rows that already exist. A phantom is a row inserted by a concurrent transaction after the first read — there was no row to lock at the time of the first query. Preventing phantoms requires locking the predicate (the WHERE condition), so that any INSERT attempting to create a row matching that condition is blocked until the reading transaction completes. This requires predicate locking, index-range locking, or conflict detection at commit time (as in serializable snapshot isolation).
The conceptual key is that row-level locks protect existing data; phantoms exploit the absence of data. A transaction that read 'all rows where status = pending' has implicitly made a claim about the complete set — but row locks only secure the members of that set at that moment, not the set's boundaries. Serializable isolation closes this gap by treating the predicate itself as a lockable resource, at the cost of reduced concurrency.