Questions: Cross-Correlation Applications and Time Delay Estimation
5 questions to test your understanding
Question 1 Multiple Choice
A sonar system records the same reflected pulse at two hydrophones placed 1.5 m apart. You cross-correlate the two recordings and find the peak at lag τ = 1 ms. The speed of sound in water is 1,500 m/s. What does this tell you?
A. The target is 1.5 m from the nearer hydrophone
B. The reflected pulse traveled 1.5 m farther to reach the second hydrophone than the first, providing a path-length difference useful for triangulation
C. The two hydrophone signals are maximally similar when one is shifted 1 ms into the future, which means they are out of phase
D. The target is moving at 1,500 m/s toward the hydrophones
Answer: B
The cross-correlation peak at τ = 1 ms means the signal arrived at the second hydrophone 1 ms later than at the first. At 1,500 m/s, this corresponds to a path-length difference of 1,500 × 0.001 = 1.5 m. Combined with the known hydrophone separation, this difference gives the arrival angle of the pulse; a second sensor pair or other geometric information is then needed to triangulate the target's position. The cross-correlation peak gives you the time delay, not the absolute distance.
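The arithmetic in this explanation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original quiz: the function name `path_difference_and_bearing` is hypothetical, and the angle formula sin θ = c·τ / d assumes a far-field (plane-wave) arrival, which the question does not state.

```python
import math

def path_difference_and_bearing(delay_s, sound_speed_m_s, separation_m):
    """Convert an inter-sensor delay into a path-length difference and a
    far-field arrival angle measured from broadside (the direction
    perpendicular to the line joining the two sensors)."""
    path_diff_m = sound_speed_m_s * delay_s
    # Clamp to [-1, 1] so rounding error cannot push asin out of its domain.
    ratio = max(-1.0, min(1.0, path_diff_m / separation_m))
    bearing_deg = math.degrees(math.asin(ratio))
    return path_diff_m, bearing_deg

# Numbers from Question 1: 1 ms delay, 1,500 m/s, hydrophones 1.5 m apart.
path_diff, bearing = path_difference_and_bearing(1e-3, 1500.0, 1.5)
print(f"path difference = {path_diff:.3f} m, bearing = {bearing:.1f} deg from broadside")
```

With the numbers from Question 1 the path difference equals the hydrophone separation, which under the far-field assumption places the source on the line through the two hydrophones. The delay alone still does not give the range to the target.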
Question 2 Multiple Choice
What is the key mathematical difference between cross-correlation R_xy(τ) and convolution (x * y)(t)?
A. Cross-correlation multiplies signal amplitudes; convolution adds them
B. Cross-correlation slides one signal without time-reversing it; convolution time-reverses one signal before sliding
C. Convolution works only in continuous time; cross-correlation works only in discrete time
D. Cross-correlation requires both signals to have the same energy; convolution does not
Answer: B
Convolution is defined as (x * y)(t) = ∫ x(τ) y(t − τ) dτ: one signal is time-reversed (y(t − τ)) before being slid. Cross-correlation is R_xy(τ) = ∫ x(t) y(t + τ) dt: y is slid without reversal. This makes convolution the right tool for computing system outputs (an input passed through an impulse response), while cross-correlation is the right tool for measuring similarity as a function of lag. For real-valued signals the two are related by R_xy(τ) = x(−τ) * y(τ), so FFT-based convolution algorithms can compute correlation once x has been time-reversed.
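The relation between correlation and convolution can be checked numerically. The sketch below assumes real-valued, discrete signals and uses NumPy; the test signals are arbitrary random data chosen only for illustration. `np.correlate(y, x)` follows the convention R_xy[k] = Σ x[n] y[n + k] used in the explanation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)   # arbitrary real test signals
y = rng.standard_normal(24)

# Discrete cross-correlation R_xy[k] = sum_n x[n] * y[n + k].
r_direct = np.correlate(y, x, mode="full")

# Same quantity as a convolution of y with the time-reversed x.
r_via_conv = np.convolve(y, x[::-1], mode="full")

# Lag axis for the "full" output: -(len(x) - 1) ... len(y) - 1.
lags = np.arange(-(len(x) - 1), len(y))

print(np.allclose(r_direct, r_via_conv))
```

Reversing `x` is the only difference between the two calls, which is why any fast convolution routine can double as a correlation routine for real signals.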
Question 3 True / False
Normalized cross-correlation produces values between −1 and +1, regardless of the absolute amplitudes of the two signals being compared.
T. True
F. False
Answer: True
Normalization divides by the square root of the product of the two signals' energies (equivalently, the product of their RMS amplitudes when the correlation is computed as an average), which cancels any amplitude scaling. The bound of ±1 then follows from the Cauchy-Schwarz inequality. The result measures only shape similarity: +1 means perfect positive match, −1 means perfect negative match (one is an inverted copy of the other), and 0 means no linear similarity at that lag. This is why normalized cross-correlation is standard for template matching: a template that appears at different signal amplitudes or brightness scalings in the data is still found at the same peak location (an additive offset additionally requires subtracting each signal's mean first).
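A small numerical check makes the amplitude invariance concrete. This is a sketch under stated assumptions: the helper `normalized_xcorr` is hypothetical, it normalizes by the total energy of each whole signal (template-matching code often normalizes per window instead), and the windowed sine template is arbitrary.

```python
import numpy as np

def normalized_xcorr(x, y):
    """Cross-correlation divided by sqrt(E_x * E_y), so every value
    lies in [-1, 1] by the Cauchy-Schwarz inequality."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    norm = np.sqrt(np.sum(x**2) * np.sum(y**2))
    return np.correlate(y, x, mode="full") / norm

t = np.linspace(0.0, 1.0, 200, endpoint=False)
template = np.sin(2 * np.pi * 5 * t) * np.hanning(t.size)

rho_same = normalized_xcorr(template, template)
rho_scaled = normalized_xcorr(template, 40.0 * template)     # much louder copy
rho_inverted = normalized_xcorr(template, -0.1 * template)   # quiet, inverted copy

print(rho_same.max(), rho_scaled.max(), rho_inverted.min())
```

The louder copy and the original peak at the same lag with the same normalized value, while the inverted copy produces a trough of the same magnitude at that lag.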
Question 4 True / False
The location of the cross-correlation peak between two sensor recordings tells you which recording has greater signal energy.
T. True
F. False
Answer: False
The peak *location* (the lag τ at which the peak occurs) identifies the time delay between the two signals — it tells you by how much one signal is shifted relative to the other. It says nothing about which has more energy. The peak *magnitude* (unnormalized) does depend on both signals' energies and their similarity, but still does not separate energy from similarity. To compare energy, you'd compute each signal's autocorrelation at zero lag, not their cross-correlation.
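The distinction between peak location and energy can be illustrated with a short experiment. The sketch assumes a noise-like signal and an integer delay of 7 samples, both chosen arbitrarily; `peak_lag` is a hypothetical helper name.

```python
import numpy as np

def peak_lag(x, y):
    """Lag k that maximizes R_xy[k] = sum_n x[n] * y[n + k]."""
    r = np.correlate(y, x, mode="full")
    lags = np.arange(-(len(x) - 1), len(y))
    return int(lags[np.argmax(r)])

rng = np.random.default_rng(1)
x = rng.standard_normal(256)
delay = 7
# y is x delayed by 7 samples (zeros shifted in at the start).
y = np.concatenate([np.zeros(delay), x[:-delay]])

found_lags = []
energies = []
for gain in (0.01, 1.0, 100.0):
    scaled = gain * y
    found_lags.append(peak_lag(x, scaled))
    # Energy is the autocorrelation at zero lag: sum of squared samples.
    energies.append(float(np.dot(scaled, scaled)))

print(found_lags)
print(energies)
```

Changing the gain of `y` changes its energy by many orders of magnitude, yet the peak lag stays at the true delay. Energy has to be read from each signal's own zero-lag autocorrelation.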
Question 5 Short Answer
How does the peak of the cross-correlation function enable time-delay estimation between two signals, and why is this useful?
Model answer: Cross-correlation R_xy(τ) = ∫ x(t) y(t + τ) dt measures how similar x and y are when y is shifted by lag τ. When x and y are two recordings of the same event (e.g., a sound pulse arriving at two microphones), the function is maximized when the shift τ exactly compensates for the travel-time difference between the two sensors — at that lag, the two recordings line up best. Locating the peak of R_xy(τ) therefore directly estimates the time delay between arrival at the two sensors. This delay, combined with the known sensor geometry and signal propagation speed, gives the direction or distance to the source.
The underlying intuition is geometric: the cross-correlation peak answers the question 'by how many samples must I shift signal y to make it look most like signal x?' That shift is the time delay. The usefulness is broad: sonar and radar use it to locate targets, GPS receivers use it to synchronize with satellite signals, and audio engineers use it to align multi-microphone recordings. The FFT-based computation makes it feasible even for long signals — O(N log N) instead of the naive O(N²).
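The FFT route to time-delay estimation mentioned above might look like the following. It is a minimal sketch assuming real-valued signals, an integer-sample delay, and a hypothetical 48 kHz sample rate; `estimate_delay_fft` is an illustrative name, and the noise burst stands in for a recorded pulse.

```python
import numpy as np

def estimate_delay_fft(x, y, sample_rate_hz):
    """Estimate how much later y is than x using FFT-based cross-correlation.
    Returns (delay in seconds, delay in samples)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x) + len(y) - 1                  # zero-pad so circular == linear correlation
    X = np.fft.rfft(x, n)
    Y = np.fft.rfft(y, n)
    r = np.fft.irfft(np.conj(X) * Y, n)      # r[k] = sum_n x[n] * y[n + k], lags wrap mod n
    r = np.roll(r, len(x) - 1)               # reorder to lags -(len(x)-1) ... len(y)-1
    lags = np.arange(-(len(x) - 1), len(y))
    best = int(lags[np.argmax(r)])
    return best / sample_rate_hz, best

fs = 48_000                                  # assumed sample rate in Hz
rng = np.random.default_rng(2)
pulse = rng.standard_normal(512)             # stand-in for the recorded event
true_delay = 48                              # 48 samples = 1 ms at 48 kHz

x = np.concatenate([pulse, np.zeros(256)])
y = np.concatenate([np.zeros(true_delay), pulse, np.zeros(256 - true_delay)])
y = y + 0.1 * rng.standard_normal(y.size)    # mild sensor noise on the second channel

delay_s, delay_samples = estimate_delay_fft(x, y, fs)
print(delay_samples, delay_s)
```

Zero-padding both signals to length len(x) + len(y) − 1 keeps the circular correlation computed by the FFT from wrapping around and contaminating the linear result. Sub-sample delays would need interpolation around the peak, which this sketch does not attempt.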