A CPU pipeline stage needs to read two source registers and write one destination register all within the same clock cycle. Is this possible with a standard register file?
A. No. Writes must complete in one cycle and reads in the next; operations cannot overlap
B. Yes. Reads are combinational (no clock edge needed), so both reads and the write can proceed simultaneously within the same cycle
C. Yes, but only if the reads and write access different physical registers in the file
D. No. Register file operations are always serialized to prevent data hazards
This is the key asymmetry of register file design. Reads are purely combinational: supplying an address to a multiplexer tree routes the corresponding register's output to the read port after only the tree's propagation delay, with no clock edge required. Writes need a clock edge to latch the new value. A read can therefore happen at any point in a clock cycle, while a write takes effect at the cycle boundary. Pipeline designers exploit this to read source operands at the beginning of a cycle and write the destination at the end, both within a single cycle. This also holds when the destination is one of the source registers, which rules out option C: the reads return the old value, and the new value is latched only at the closing edge.
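The read/write asymmetry can be sketched as a small behavioral model. This is only an illustration, not a hardware description: the class `RegisterFile` and its methods `read`, `request_write`, and `clock_edge` are hypothetical names chosen for this example, and the model assumes a single write port and an 8-entry file.

```python
class RegisterFile:
    """Toy behavioral model: combinational reads, edge-triggered write.

    All names here are hypothetical and exist only for illustration.
    """

    def __init__(self, num_regs=8):
        self.regs = [0] * num_regs
        self._pending = None  # (address, value) driven on the write port

    def read(self, addr):
        # Combinational read: returns whatever is stored right now.
        return self.regs[addr]

    def request_write(self, addr, value):
        # Drive the write port; nothing is stored until the clock edge.
        self._pending = (addr, value)

    def clock_edge(self):
        # The clock edge latches the pending write, if there is one.
        if self._pending is not None:
            addr, value = self._pending
            self.regs[addr] = value
            self._pending = None


rf = RegisterFile()
rf.regs[1], rf.regs[2] = 5, 7

# One cycle of a hypothetical "add r3, r1, r2"
a = rf.read(1)               # read source 1 during the cycle
b = rf.read(2)               # read source 2 during the same cycle
rf.request_write(3, a + b)   # present the result to the write port
before_edge = rf.read(3)     # still the old contents of r3
rf.clock_edge()              # cycle boundary: the write takes effect
after_edge = rf.read(3)      # now holds the sum
```

Reading `r3` before the edge still returns the old contents, which is why a read and a write to the same register in one cycle do not conflict in this model.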
Question 2 Multiple Choice
In a register file, how does the write logic ensure that only the targeted register is updated when a write occurs?
A. The write data is broadcast to all registers, and each register compares it to its current value before deciding to update
B. A decoder converts the write address to a one-hot enable signal, activating exactly one register's clock input while all others ignore the incoming data
C. The write port serializes the update across all registers in sequence, stopping when the correct address is matched
D. A priority encoder selects the highest-address register that has been idle longest
The decoder is the key mechanism. A write address of, say, 3 bits can address 8 registers. The decoder converts this 3-bit address into an 8-bit one-hot signal where exactly one bit is high. Only the register with that enable line active will latch the incoming write data when the clock edge arrives. All others see their clock enable as low and hold their current value unchanged. This is clean, fast, and parallel — all 8 registers see the write data, but only one acts on it.
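As a rough sketch of the decoder behavior described above, the following models a 3-to-8 one-hot decode and a single clock edge. The function names `decode_one_hot` and `write_on_edge` are made up for this example, and the register contents are arbitrary.

```python
def decode_one_hot(addr, num_regs=8):
    """Return enable bits with exactly one 1, at index addr."""
    if not 0 <= addr < num_regs:
        raise ValueError("address out of range")
    return [1 if i == addr else 0 for i in range(num_regs)]


def write_on_edge(regs, addr, data):
    """Model one clock edge: every register sees data, only the enabled one latches it."""
    enables = decode_one_hot(addr, len(regs))
    return [data if en else old for old, en in zip(regs, enables)]


regs = [10, 11, 12, 13, 14, 15, 16, 17]   # arbitrary starting contents
enables = decode_one_hot(5)               # 3-bit address 0b101
new_regs = write_on_edge(regs, 5, 99)     # only register 5 changes
```

Here `enables` has a single high bit at index 5, and `new_regs` differs from `regs` only at that position.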
Question 3 True / False
A register file with two read ports requires two independent multiplexer trees so that both source operands can be accessed at the same time.
T. True
F. False
Answer: True
Each read port has its own address input and its own independent multiplexer tree that routes from the register outputs to the port output. The two trees operate in parallel, so two different register addresses can be presented simultaneously and both outputs become available within the same combinational delay. This is the standard design for a CPU datapath: ALU instructions take two source operands (rs1 and rs2), so two read ports allow fetching both simultaneously rather than sequentially.
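One way to picture the two independent read ports is as two calls to the same multiplexer-tree function, each with its own address. The sketch below assumes a binary tree of 2:1 multiplexers and a power-of-two register count; `mux_tree` is a hypothetical helper, and real designs may use wider multiplexers or other selection circuits.

```python
def mux_tree(values, addr):
    """Select values[addr] with layers of 2:1 muxes, one address bit per layer.

    Assumes len(values) is a power of two.
    """
    layer = list(values)
    bit = 0
    while len(layer) > 1:
        sel = (addr >> bit) & 1
        # Each adjacent pair feeds one 2:1 mux driven by the current address bit.
        layer = [layer[i + 1] if sel else layer[i] for i in range(0, len(layer), 2)]
        bit += 1
    return layer[0]


regs = [100, 101, 102, 103, 104, 105, 106, 107]

# Two read ports: same register outputs, separate address inputs and trees.
rs1_out = mux_tree(regs, 2)
rs2_out = mux_tree(regs, 6)
```

Neither call modifies `regs` or depends on the other, which mirrors how the two hardware trees settle in parallel.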
Question 4 True / False
Register files are kept small compared to caches primarily because they use slower, denser memory cells that require fewer transistors per bit.
T. True
F. False
Answer: False
The causality is reversed. Register files are fast *because* they are small, not the other way around. Each read port requires its own multiplexer tree that spans all registers; as the number of registers grows, the tree grows with it, increasing both area and delay. Register files are built from fast flip-flops or multi-ported cells, which cost more transistors per bit than the dense SRAM cells typically used in caches, so the premise about slower, denser cells is also wrong. The constraint on size comes from the cost of the addressing and multiplexing logic, not from slower storage technology.
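To get a feel for how the read logic scales, the sketch below counts multiplexers for one read port. It assumes a binary tree of 2:1 multiplexers, a power-of-two register count, and a 32-bit register width; these are illustrative assumptions, and `read_port_cost` is a hypothetical helper rather than a model of any real design.

```python
def read_port_cost(num_regs, width=32):
    """Return (mux_count, tree_depth) for one read port.

    Assumes a binary tree of 2:1 muxes and a power-of-two num_regs.
    """
    depth = (num_regs - 1).bit_length()   # log2(num_regs) levels of muxes
    muxes = (num_regs - 1) * width        # N - 1 two-input muxes per data bit
    return muxes, depth


cost_8 = read_port_cost(8)
cost_32 = read_port_cost(32)
cost_128 = read_port_cost(128)
```

Under these assumptions, going from 8 to 128 registers raises the count from 224 to 4064 multiplexers per read port and the depth from 3 to 7 levels, and a second read port doubles the multiplexer count again.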
Question 5 Short Answer
Why are reads from a register file described as 'combinational' while writes require a clock edge, and what practical benefit does this asymmetry provide?
Model answer: Reads are combinational because the register file's read logic is just wires and multiplexers — supply an address, and the correct register's output is immediately routed to the read port with no state change and no clock edge needed. Writes require a clock edge because they update state: the incoming data must be latched into flip-flops, and flip-flops only update on a clock edge. The practical benefit is that pipeline stages can read source operands at the beginning of a cycle and write the result at the end of the same cycle, enabling full-throughput pipelining without stalling for register access.
This asymmetry is fundamental to how pipelined processors achieve one-instruction-per-cycle throughput (under ideal conditions). If every register access had to wait for its own clock edge, a typical three-operand instruction (two reads, one write) could spend up to three cycles on register access alone, or two if both reads shared a cycle. The combinational read adds only a multiplexer delay within the cycle, leaving most of the cycle available for computation. It's one of the key design choices that makes modern pipelines fast.