Questions — Object Detection Networks

Question 1 Multiple Choice

Two object detection systems are benchmarked: System A runs at 4 FPS with 87% mean average precision (mAP); System B runs at 50 FPS with 76% mAP. Which architectural family most likely corresponds to each?

A. System A: YOLO-style single-shot detector; System B: Faster R-CNN two-stage detector

B. System A: Faster R-CNN two-stage detector; System B: YOLO-style single-shot detector

C. System A: R-CNN with selective search; System B: SSD single-shot detector

D. System A: sliding-window CNN classifier; System B: Faster R-CNN with Feature Pyramid Network
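
As an aid for reading the benchmark figures, frames per second can be converted to a per-frame time budget. The snippet below is only an arithmetic sketch; the helper name `frame_latency_ms` is hypothetical and not part of any library.

```python
def frame_latency_ms(fps: float) -> float:
    """Return the average time spent on one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Figures taken from the question: System A at 4 FPS, System B at 50 FPS.
latency_a = frame_latency_ms(4)    # 250.0 ms per frame
latency_b = frame_latency_ms(50)   # 20.0 ms per frame
speedup = latency_a / latency_b    # System B is 12.5x faster per frame
```

In other words, System B trades 11 points of mAP for a per-frame time roughly 12.5 times shorter.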

Question 2 Multiple Choice

A detector produces 18 overlapping bounding boxes around the same cat in an image, all with varying confidence scores. What technique selects the single best prediction and discards the rest?

A. Feature Pyramid Network (FPN), which merges multi-scale features into one prediction

B. Region Proposal Network (RPN), which filters out redundant proposals before classification

C. Non-maximum suppression (NMS), which keeps the highest-confidence box and removes overlapping duplicates

D. Anchor box matching, which assigns each object to exactly one grid cell
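
The "overlap" between two boxes in this question is usually quantified as intersection over union (IoU). The following is a minimal sketch, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates with `x1 < x2` and `y1 < y2`; the function name `iou` and the box format are illustrative assumptions, not something the question specifies.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so non-overlapping boxes give zero intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

Identical boxes give an IoU of 1, and disjoint boxes give 0. Many boxes around the same object, as in the scenario above, would typically have high pairwise IoU.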

Question 3 True / False

In Faster R-CNN, the convolutional backbone processes the image only once, and the resulting feature map is shared between the Region Proposal Network and the classification head.

T. True

F. False

Question 4 True / False

Object detection is fundamentally equivalent to running an image classifier on a sliding window at nearly every possible location and scale, making it a straightforward extension of image classification.

T. True

F. False

Question 5 Short Answer

Explain why Feature Pyramid Networks (FPN) are used in object detection, and what problem they solve that a single feature map from the last convolutional layer cannot handle.

