A system uses Q8.8 fixed-point format (8 integer bits, 8 fractional bits). The stored integer value is 640. What real number does this represent?
A. 640.0: the stored value is the real value
B. 2.5: divide by 2⁸ = 256 to convert from stored integer to real number
C. 160.0: divide by 2² = 4 because there are 2 fractional bits per byte
D. 0.0025: multiply by 2⁻⁸ twice because the format has 8 fractional bits on each side
In Q8.8 format, the real value equals the stored integer divided by 2⁸ = 256, because the binary point sits 8 positions from the right. 640 / 256 = 2.5. You can verify: 2 in binary is 00000010.00000000, which is stored as 512; 0.5 in binary is 0.10000000, which is stored as 128; 512 + 128 = 640. The stored integer is not the real value — it is the real value scaled up by 2⁸. This scaling convention is the core of fixed-point representation.
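The conversion above can be sketched in a few lines of Python. The helper names `to_real` and `to_stored` are hypothetical, chosen only for this illustration, and the sketch assumes non-negative values.

```python
FRAC_BITS = 8            # Q8.8: 8 fractional bits
SCALE = 1 << FRAC_BITS   # 2**8 = 256

def to_real(stored: int) -> float:
    """Interpret a stored Q8.8 integer as a real number."""
    return stored / SCALE

def to_stored(real: float) -> int:
    """Encode a real number as a Q8.8 integer, rounding to the nearest step."""
    return round(real * SCALE)

print(to_real(640))      # 2.5
print(to_stored(2.0))    # 512
print(to_stored(0.5))    # 128
print(to_stored(2.5))    # 640
```

Both directions use the same scale factor, which is the whole convention: multiply by 256 to store, divide by 256 to read back.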
Question 2 Multiple Choice
When would fixed-point arithmetic be preferred over floating-point arithmetic?
A. When the program needs to represent very large and very small numbers simultaneously
B. When maximum numerical precision is required regardless of magnitude
C. When hardware simplicity, low power consumption, or deterministic timing matters more than dynamic range, such as in embedded systems, DSP, or motor control
D. When the range of values is unpredictable at design time
Fixed-point is preferred when you can commit to a known value range at design time and when a floating-point unit is unacceptable due to cost, power, or timing requirements. Fixed-point arithmetic uses the standard integer ALU, so no dedicated FPU is needed. This is why early gaming hardware (such as the original PlayStation), audio DSPs, and microcontrollers have often used fixed-point. When the value range is unpredictable or spans many orders of magnitude, floating-point is the better choice because its exponent rescales each value, keeping relative precision roughly constant across magnitudes. Fixed-point's uniform precision is a feature when the range is known, a liability when it isn't.
Question 3 True / False
In a fixed-point number system, the position of the binary point is stored explicitly in each number so that the hardware knows how to interpret the bits.
T. True
F. False
Answer: False
The binary point position is implicit — an agreed-upon convention that exists only in the programmer's interpretation of the bit pattern, not in the stored data itself. The hardware stores and manipulates an ordinary integer; it has no awareness of where the 'decimal point' lies. This is both the strength and the danger of fixed-point: arithmetic is simple (just integer operations), but if two values with different binary point positions are added without alignment, the result is silently wrong. Floating-point, by contrast, stores the exponent explicitly, allowing the hardware to handle numbers at different scales automatically.
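The alignment hazard described above can be illustrated with a minimal sketch. The `encode` and `decode` helpers and the choice of Q12.4 for the second operand are assumptions made for the example; they are not part of any particular library.

```python
def encode(real: float, frac_bits: int) -> int:
    """Scale a real number by 2**frac_bits and round to an integer."""
    return round(real * (1 << frac_bits))

def decode(stored: int, frac_bits: int) -> float:
    """Undo the scaling applied by encode()."""
    return stored / (1 << frac_bits)

a = encode(2.5, 8)    # Q8.8  -> 640
b = encode(1.25, 4)   # Q12.4 -> 20

# Wrong: the integer adder knows nothing about the two different scales.
wrong = decode(a + b, 8)            # 660 / 256 = 2.578125

# Right: shift b left by 4 so both operands have 8 fractional bits.
b_aligned = b << (8 - 4)            # 20 << 4 = 320
right = decode(a + b_aligned, 8)    # 960 / 256 = 3.75

print(wrong, right)
```

No error is raised in the unaligned case. The addition succeeds and simply yields the wrong number, which is why the binary point convention has to be tracked by the programmer.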
Question 4 True / False
Fixed-point arithmetic provides uniform precision across all representable values, unlike floating-point which has higher precision near zero.
T. True
F. False
Answer: True
In a Q8.8 format, the spacing between adjacent representable values is always 1/256 ≈ 0.0039, whether the value is near 0 or near 255. With round-to-nearest conversion, this uniform spacing means the absolute error of any in-range fixed-point value is at most 1/512 ≈ 0.002 (half the spacing). Floating-point sacrifices this uniformity for dynamic range: the significand has a fixed number of bits, so the gap between adjacent values scales with the exponent, which makes absolute precision very fine near zero and increasingly coarse for large values. For applications where uniform error distribution matters, such as audio sample arithmetic, fixed-point's regularity is an advantage.
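As a rough illustration of the contrast, the sketch below compares the gap between neighbouring Q8.8 values with the gap between neighbouring floating-point values. It assumes an unsigned Q8.8 format and uses Python's double-precision floats with `math.ulp` (available from Python 3.9); `q88_gap` is a hypothetical helper.

```python
import math

SCALE = 1 << 8  # Q8.8 scale factor, 2**8 = 256

def q88_gap(stored: int) -> float:
    """Real-valued distance from one stored Q8.8 integer to the next."""
    return (stored + 1) / SCALE - stored / SCALE

# Fixed-point: the gap is the same at both ends of the range.
print(q88_gap(1))        # 0.00390625
print(q88_gap(65000))    # 0.00390625

# Floating-point: the gap to the next value grows with magnitude.
print(math.ulp(1.0))
print(math.ulp(1024.0))  # 1024 times larger than the gap at 1.0
```

The fixed-point gap never changes, while the floating-point gap scales with the value, which is the trade of uniformity for dynamic range.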
Question 5 Short Answer
Why must programmers perform 'scaling analysis' before using fixed-point arithmetic, and what can go wrong if they skip it?
Model answer: Scaling analysis is the process of choosing the binary point position (the Q format) by analyzing the full expected range of every variable in the computation. The chosen format must represent the maximum value without overflow and the minimum meaningful value without losing it to rounding. If the range is underestimated, computed values overflow silently: the upper bits wrap around and produce garbage. If the format reserves too many integer bits, those bits are spent on a range that never occurs, at the cost of fractional precision. Unlike floating-point, fixed-point gives no warning when values go out of range; the programmer must guarantee correctness by design.
This discipline is why fixed-point code is harder to write correctly than floating-point code, even though the hardware is simpler. Every intermediate calculation must be tracked through its possible range. Multiplying two Q8.8 values produces a Q16.16 intermediate result (32 bits, 16 of them fractional) that must be right-shifted by 8 bits to return to 8 fractional bits, and the programmer must ensure both that the pre-shift product fits in the available bits and that the shifted result fits back into the 16-bit Q8.8 range. In audio and DSP applications, scaling errors produce audible artifacts or silent corruption, which is why fixed-point software requires careful mathematical analysis, not just implementation.
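The multiply-then-shift step can be sketched as follows. This is a minimal illustration assuming unsigned Q8.8 values held in 16 bits; `q88_mul` and `Q88_MAX` are hypothetical names. Python integers never overflow, so the sketch checks the range explicitly and emulates 16-bit wrap-around with a mask.

```python
FRAC_BITS = 8
Q88_MAX = 0xFFFF  # largest stored value, assuming unsigned 16-bit Q8.8

def q88_mul(a: int, b: int) -> int:
    """Multiply two unsigned Q8.8 values, returning a Q8.8 result."""
    wide = a * b                 # Q16.16 intermediate: 16 fractional bits
    result = wide >> FRAC_BITS   # drop 8 fractional bits (truncates)
    if result > Q88_MAX:
        # 16-bit hardware would silently wrap here; this sketch raises instead.
        raise OverflowError("product does not fit in Q8.8")
    return result

# 2.5 * 1.5 fits comfortably.
print(q88_mul(640, 384) / 256)            # 3.75

# 20.0 * 20.0 = 400.0 exceeds the Q8.8 range; emulate 16-bit wrap-around.
wrapped = ((5120 * 5120) >> FRAC_BITS) & 0xFFFF
print(wrapped / 256)                      # 144.0
```

The wrapped result of 144.0 looks like a perfectly plausible value, which is what makes a missed scaling analysis hard to detect at run time.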