Users who watch classic arthouse films on a streaming platform tend to also watch a certain recently released foreign film — even though the new film has no obvious genre, director, or stylistic similarities to the arthouse classics. The platform uses this pattern to recommend the new film to users who have watched arthouse movies. Which recommendation approach is this?
A. Content-based filtering, because the recommendation is based on the user's film preferences
B. Collaborative filtering, because the recommendation is based on patterns of user-item interactions without using item features
C. A hybrid system, because it requires both item features and user interaction data
D. Popularity-based filtering, because the recommendation reflects what many users watch
Collaborative filtering ignores item features entirely — it relies purely on who-liked-what patterns. If users who watched arthouse films also watched this new film, the system recommends it to similar users regardless of what the new film is 'about.' This is the defining characteristic of collaborative filtering: it finds users with similar interaction histories and leverages those similarities. Content-based filtering (option A) would instead look at the new film's genre, director, and other features and recommend it only if those features match a user's preference model. The absence of any feature matching in this example identifies it as collaborative.
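The who-watched-what idea can be sketched with simple item co-occurrence counts. This is a minimal illustration rather than how any particular platform works: the watch histories, film identifiers, and the helper names `co_occurrence` and `recommend` are all hypothetical. Note that no genre, director, or other item feature appears anywhere in the code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical watch histories: user -> set of film IDs (no item features at all).
histories = {
    "u1": {"arthouse_a", "arthouse_b", "new_foreign_film"},
    "u2": {"arthouse_a", "new_foreign_film"},
    "u3": {"arthouse_b", "new_foreign_film"},
    "u4": {"action_x", "action_y"},
    "u5": {"arthouse_a", "arthouse_b"},
}

def co_occurrence(histories):
    """Count how often each ordered pair of items appears in the same history."""
    counts = Counter()
    for items in histories.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts

def recommend(user, histories, counts, k=1):
    """Score unseen items by their summed co-occurrence with the user's items."""
    seen = histories[user]
    scores = Counter()
    for (a, b), n in counts.items():
        if a in seen and b not in seen:
            scores[b] += n
    return [item for item, _ in scores.most_common(k)]

counts = co_occurrence(histories)
print(recommend("u5", histories, counts))  # u5 has only watched the arthouse films
print(recommend("u4", histories, counts))  # u4 shares no items with the arthouse viewers
```

In this toy data, the arthouse-only viewer is offered the new film purely because other arthouse viewers watched it, while the action viewer gets nothing because no co-watching pattern links them to it.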
Question 2 Multiple Choice
A recommendation system has just been deployed for a new e-commerce platform. The product catalog has 500,000 items. On Day 1, only 200 users have signed up and each has purchased exactly one item. Which challenge most fundamentally limits the system's ability to make good recommendations?
A. Scalability — 500,000 items is too many to rank for each user in real time
B. Cold-start — with almost no user interaction history, collaborative filtering cannot find similar users or score unrated items
C. Data sparsity — users have only rated 1 item each, making the interaction matrix sparse
D. Filter bubble — the system will only recommend items similar to what users already bought
Cold-start is the most fundamental limitation here. Collaborative filtering works by finding users with similar interaction histories, but with only one purchase each there is almost no signal for identifying similar users. The system cannot determine which users are 'like' each other, and items with no interactions cannot be scored at all. Data sparsity (option C) is related but distinct: sparsity is an ongoing condition in nearly all recommendation systems, while cold-start is the extreme case where there is essentially no history to work with. Scalability (option A) is a real production challenge but not the fundamental issue on Day 1. Filter bubble (option D) describes over-narrow personalization, which can arise in both content-based and collaborative systems, but it presupposes enough interaction data to personalize at all, so it is not the limiting factor here.
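The Day 1 numbers can be worked through directly. The sketch below uses the figures from the question (200 users, 500,000 items, one purchase each); the purchase assignment and the helper `neighbor_candidates` are hypothetical and only illustrate a simple user-based neighborhood approach.

```python
n_users, n_items = 200, 500_000
purchases_per_user = 1

interactions = n_users * purchases_per_user
# Fraction of the user-item matrix that holds any signal.
density = interactions / (n_users * n_items)
# At most 200 distinct items have been bought, so at least this share has no history.
max_items_with_history = min(interactions, n_items)
min_unscored_fraction = 1 - max_items_with_history / n_items

# Hypothetical purchases: some users happen to buy the same item.
purchases = {f"u{i}": {f"item{i % 150}"} for i in range(n_users)}

def neighbor_candidates(user, purchases):
    """Items bought by users who overlap with `user`, excluding the user's own items."""
    mine = purchases[user]
    candidates = set()
    for other, theirs in purchases.items():
        if other != user and mine & theirs:
            candidates |= theirs - mine
    return candidates

total_candidates = sum(len(neighbor_candidates(u, purchases)) for u in purchases)
print(density, min_unscored_fraction, total_candidates)
```

With exactly one purchase per user, two users can only overlap by having bought the same item, so a "similar" user never contributes anything new. Under these assumptions the neighborhood method yields zero candidates for every user, and at least 99.96% of the catalog has no interaction history at all.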
Question 3 True / False
A collaborative filtering system can recommend a movie to a user even if the system has never analyzed what that movie is about — its genre, director, themes, or cast.
T. True
F. False
Answer: True
This is the defining feature of collaborative filtering: it operates entirely on the user-item interaction matrix (ratings, clicks, purchases) without any representation of item content. Two users who agreed on movies in the past are predicted to agree on future movies — the system infers a notion of 'similarity' from behavioral patterns alone. This is both a strength (it can discover non-obvious connections between items) and a weakness (it cannot score items that have never been interacted with). In contrast, content-based filtering requires item feature representations to function. Understanding this distinction clarifies when each approach is appropriate and what hybrid systems must combine.
Question 4 True / False
A recommendation system that achieves lower RMSE (root mean squared error) on held-out ratings will reliably produce better recommendations than one with higher RMSE, because users care most about accurate rating predictions.
T. True
F. False
Answer: False
RMSE measures how accurately a system predicts the exact rating a user would give an item (e.g., predicting 3.8 vs. an actual 4.0). But users mostly experience which items appear in the top 5–10 recommendations, not the precise numerical scores. A system that accurately separates 3-star from 4-star items but misranks the top items is less useful than one that correctly identifies the top 10 even if its predicted scores are numerically imprecise. Ranking metrics like precision@k, recall@k, and NDCG (normalized discounted cumulative gain) measure the quality of that top-of-list experience more directly. The Netflix Prize famously optimized for RMSE, but Netflix later reported that it never put the full winning ensemble into production, in part because the measured accuracy gains did not justify the engineering effort.
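A small constructed example shows how the two kinds of metric can disagree. All ratings and predictions below are made up for illustration, and treating "true rating of at least 4.5" as the definition of a relevant item is an assumption of this sketch.

```python
import math

def rmse(true, pred):
    """Root mean squared error over all items with a true rating."""
    return math.sqrt(sum((pred[i] - true[i]) ** 2 for i in true) / len(true))

def precision_at_k(true, pred, k, threshold):
    """Share of the k highest-predicted items whose true rating meets the threshold."""
    top_k = sorted(pred, key=pred.get, reverse=True)[:k]
    return sum(true[i] >= threshold for i in top_k) / k

# Hypothetical held-out ratings for one user.
true = {"i1": 5.0, "i2": 4.5, "i3": 4.0, "i4": 2.0, "i5": 1.0}

# Model A: numerically close, but it puts i3 ahead of the two best items.
model_a = {"i1": 4.2, "i2": 4.3, "i3": 4.6, "i4": 2.0, "i5": 1.0}
# Model B: scores are compressed toward the middle, but the order is correct.
model_b = {"i1": 3.5, "i2": 3.2, "i3": 3.0, "i4": 2.9, "i5": 2.5}

rmse_a, rmse_b = rmse(true, model_a), rmse(true, model_b)
p_a = precision_at_k(true, model_a, k=2, threshold=4.5)
p_b = precision_at_k(true, model_b, k=2, threshold=4.5)
print(f"A: rmse={rmse_a:.3f} precision@2={p_a}")
print(f"B: rmse={rmse_b:.3f} precision@2={p_b}")
```

Here Model A has the lower RMSE but only half of its top two are relevant, while Model B has a much higher RMSE and a perfect top two.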
Question 5 Short Answer
Why do large-scale recommendation systems typically use a two-stage architecture — first a retrieval stage, then a ranking stage — rather than scoring all items with a single model for each user query?
Model answer: Scoring every item in a catalog of millions with an expensive model would take too long for real-time recommendations, where the whole response must be produced in milliseconds. The retrieval stage uses a fast, approximate method, such as approximate nearest neighbor search on user and item embeddings, to narrow millions of candidates down to hundreds using only a small fraction of that latency budget. The ranking stage then runs a more expensive, more accurate model on those hundreds of candidates, using richer features and more complex interactions. This two-stage design achieves the speed needed for real-time inference without sacrificing ranking quality for the items that actually appear in the recommendations.
This architecture also allows the two stages to be optimized independently and updated at different frequencies. Retrieval can use embedding similarity to surface a diverse, plausible set of candidates; ranking can use user context, real-time signals, business rules (e.g., promoted items), and heavy neural models. The trade-off is that items rejected by the retrieval stage can never appear in recommendations, so retrieval recall matters as much as ranking precision.
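The retrieve-then-rank flow can be sketched in a few lines. Everything here is an illustrative assumption: the embeddings, the `freshness` and `promoted` features, and the scoring weights are invented, and the brute-force dot-product scan merely stands in for the approximate nearest neighbor index a production system would use.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def retrieve(user_vec, item_vecs, n_candidates):
    """Stage 1: cheap embedding similarity (brute force here, ANN in practice)."""
    ranked = sorted(item_vecs, key=lambda i: dot(user_vec, item_vecs[i]), reverse=True)
    return ranked[:n_candidates]

def rank(candidates, user_vec, item_vecs, item_features, k):
    """Stage 2: a costlier, hypothetical scorer applied only to retrieved candidates."""
    def score(i):
        f = item_features[i]
        boost = 1.0 if f["promoted"] else 0.0  # stand-in for a business rule
        return dot(user_vec, item_vecs[i]) + 0.5 * f["freshness"] + boost
    return sorted(candidates, key=score, reverse=True)[:k]

# Hypothetical two-dimensional embeddings and side features.
user_vec = [1.0, 0.0]
item_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.8, 0.2]}
item_features = {
    "a": {"freshness": 0.0, "promoted": False},
    "b": {"freshness": 0.4, "promoted": False},
    "c": {"freshness": 1.0, "promoted": True},
    "d": {"freshness": 0.2, "promoted": False},
}

candidates = retrieve(user_vec, item_vecs, n_candidates=3)
top = rank(candidates, user_vec, item_vecs, item_features, k=2)
print(candidates, top)
```

Item `c` would receive the highest score from the ranking function, yet it never appears in the output because the retrieval stage drops it first. That is the recall trade-off described above, in miniature.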