Every instruction passes through three main stages: fetching from memory, decoding to determine operation and operand addresses, and executing the operation. This cycle forms the heartbeat of the processor.
Trace a sample instruction (e.g., ADD R1, R2, R3) through each stage, observing which control signals activate and how data flows.
Different instructions may have different cycle counts in real processors. Memory fetch and execution are not always single-cycle operations.
Every program you run is ultimately a sequence of binary-encoded instructions sitting in memory. The processor's job is to work through them one at a time using a repeating three-phase rhythm: fetch, decode, and execute. This cycle is so fundamental that it defines what a processor *does* — without it, the instruction set architecture you already know would be just a specification with no engine to run it.
In the fetch phase, the processor reads the next instruction from the memory address stored in the program counter (PC). Think of the PC as a bookmark in a recipe book: it tells the processor exactly which instruction to read next. The instruction bytes are loaded into a dedicated holding register called the instruction register (IR), and the PC advances by the length of the instruction so that it points at the following one. This increment happens automatically, which is why programs execute sequentially by default. Branch and jump instructions work by overwriting the PC with a different address, breaking the sequential flow.
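The fetch step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: memory is modeled as a word-addressed list (so the PC advances by 1 per instruction), and the function name `fetch` and the instruction words are made up for the example rather than taken from any real processor.

```python
def fetch(memory, pc):
    """Read the word at address pc and return it with the incremented PC.

    Assumes a word-addressed memory where each instruction is one word,
    so "the next instruction" is simply pc + 1.
    """
    instruction = memory[pc]  # this value would be latched into the IR
    return instruction, pc + 1

# Hypothetical program: three arbitrary instruction words.
memory = [0x1123, 0x2102, 0x3003]

pc = 0
ir, pc = fetch(memory, pc)          # sequential fetch
first_ir, pc_after_first = ir, pc   # saved so the effect is easy to inspect

# A jump does nothing more exotic than overwrite the PC.
pc = 2
ir, pc = fetch(memory, pc)
```

After the first call the IR holds the word at address 0 and the PC points at address 1. The "jump" then redirects the next fetch to address 2, which is the whole mechanism behind non-sequential control flow in this model.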
During decode, the processor examines the bits in the instruction register to figure out what operation to perform and which operands to use. The opcode field identifies the operation (add, load, branch, etc.), and the remaining fields specify registers or memory addresses. The control unit translates this encoding into internal signals: which ALU operation to select, which registers to read, whether memory should be accessed. If you think of the instruction as a sentence, decoding is parsing it into verb, subject, and object so the processor knows what action to take and on what data.
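To make the idea of instruction fields concrete, here is a minimal decoding sketch. The 16-bit format (a 4-bit opcode followed by three 4-bit register fields) and the opcode numbering are assumptions invented for this example. Real instruction sets define their own layouts.

```python
from typing import NamedTuple

class Decoded(NamedTuple):
    opcode: int  # which operation to perform
    rd: int      # destination register number
    rs1: int     # first source register number
    rs2: int     # second source register number

# Hypothetical opcode assignments for the illustrative format.
OPCODE_NAMES = {0x1: "ADD", 0x2: "LOAD", 0x3: "BEQ"}

def decode(word):
    """Split a 16-bit word into fields: [opcode:4][rd:4][rs1:4][rs2:4]."""
    return Decoded(
        opcode=(word >> 12) & 0xF,
        rd=(word >> 8) & 0xF,
        rs1=(word >> 4) & 0xF,
        rs2=word & 0xF,
    )

# Under this encoding, 0x1123 stands for ADD R1, R2, R3.
d = decode(0x1123)
name = OPCODE_NAMES[d.opcode]
```

In hardware the same field extraction is done by wiring bit ranges of the IR to the control unit and register file. The shifts and masks here play the role of those wires.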
The execute phase carries out the decoded operation. For an arithmetic instruction like ADD R1, R2, R3, this means routing the values from registers R2 and R3 into the ALU, performing the addition, and writing the result back to R1. For a load instruction, it means computing a memory address, sending it to the memory system, and storing the returned data in a register. For a branch, it means evaluating a condition and potentially updating the PC. Each type of instruction exercises a different path through the datapath hardware, but the three-phase cycle structure remains the same.
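The three instruction types described above can be sketched as a single dispatch function. Everything here is a simplified assumption: the operation names, the meaning of the fields `a`, `b`, and `c`, and the use of plain Python lists for the register file and memory are illustrative choices, not a model of a specific machine.

```python
def execute(op, a, b, c, regs, memory, pc):
    """Carry out one decoded instruction and return the (possibly new) PC.

    Assumed field meanings for this sketch:
      ADD  a, b, c : regs[a] = regs[b] + regs[c]
      LOAD a, b, c : regs[a] = memory[regs[b] + c]   (base register + offset)
      BEQ  a, b, c : if regs[a] == regs[b], continue at address c
    """
    if op == "ADD":
        regs[a] = regs[b] + regs[c]      # ALU result written back to a register
    elif op == "LOAD":
        address = regs[b] + c            # address computation
        regs[a] = memory[address]        # data returned by the memory system
    elif op == "BEQ":
        if regs[a] == regs[b]:
            pc = c                       # taken branch overwrites the PC
    else:
        raise ValueError(f"unknown operation: {op}")
    return pc

regs = [0, 0, 5, 7]          # R0..R3
memory = [10, 20, 30, 40]
pc = 1                       # PC already incremented by the fetch step

pc = execute("ADD", 1, 2, 3, regs, memory, pc)   # R1 = R2 + R3
add_result = regs[1]
pc = execute("LOAD", 1, 0, 2, regs, memory, pc)  # R1 = memory[R0 + 2]
load_result = regs[1]
pc = execute("BEQ", 0, 0, 3, regs, memory, pc)   # R0 == R0, so branch to 3
```

Each branch of the `if` chain corresponds to a different path through the datapath: the ALU for ADD, the memory interface for LOAD, and the PC update logic for BEQ.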
In a simple multi-cycle design, each of these phases takes one clock cycle, and the processor completes one instruction every three cycles before starting the next. Real processors complicate this picture considerably: some instructions need multiple cycles for memory access or complex computation, and techniques like pipelining overlap the phases of consecutive instructions to improve throughput. But the fetch-decode-execute cycle remains the conceptual backbone. Many optimizations in processor design come down to making this cycle run faster or overlap more efficiently.
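A rough cycle count shows why overlapping the phases helps. The sketch below assumes an idealized machine: exactly three stages, one clock cycle per stage, and no stalls or hazards. The function names are hypothetical, and real pipelines fall short of this ideal.

```python
STAGES = 3  # fetch, decode, execute, each assumed to take one cycle

def sequential_cycles(n_instructions):
    """Cycles when each instruction finishes all stages before the next starts."""
    return STAGES * n_instructions

def pipelined_cycles(n_instructions):
    """Cycles for an ideal pipeline with no stalls.

    The first instruction needs STAGES cycles to pass through; each later
    instruction completes one cycle after the one before it.
    """
    if n_instructions == 0:
        return 0
    return STAGES + (n_instructions - 1)

n = 10
seq = sequential_cycles(n)   # 3 * 10
pipe = pipelined_cycles(n)   # 3 + 9
```

For ten instructions the sequential model needs 30 cycles while the ideal pipeline needs 12. As the instruction count grows, the ratio approaches the number of stages, which is the upper bound on speedup in this simplified model.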