Question 1 Multiple Choice
A music platform releases a brand-new song with no play history. A pure collaborative filtering system is asked to generate recommendations involving this song. What fundamental problem does this illustrate?
A. The platform cannot compute audio features for the new song without a content-based component
B. The cold-start problem: collaborative filtering has no interaction patterns to leverage for an item that no user has rated, so it cannot generate recommendations involving that item
C. The sparsity problem: the new song adds a sparse row to the user-item matrix, degrading overall similarity calculations
D. The dimensionality problem: the new song's latent factor vector cannot be initialized without user history
Answer: B
Collaborative filtering works entirely from the pattern of who liked what. A brand-new item has no ratings, so there are no patterns to leverage and the system is blind to it. This is the cold-start problem, an inherent limitation of the approach. Options C and D describe related issues but miss the core point: sparsity and dimensionality affect existing items too, whereas cold start means the new item cannot be recommended at all. Option A is beside the point, because pure collaborative filtering never uses audio features in the first place.
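The blindness to an unrated item can be seen in a minimal sketch. The toy `ratings` matrix, the convention that 0 means "no interaction", and the helper names `column` and `cosine` are all assumptions made for illustration; cosine similarity is one common choice of item similarity, not the only one.

```python
import math

# Hypothetical toy data: rows are users, columns are songs.
# 0 means "no interaction"; the last column is the brand-new song.
ratings = [
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [1, 0, 5, 0],
]

def column(matrix, j):
    """Return column j of a list-of-rows matrix."""
    return [row[j] for row in matrix]

def cosine(a, b):
    """Cosine similarity; defined here as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

n_items = len(ratings[0])
new_song = n_items - 1

# Similarity of every existing song to the new song: nothing to compare against.
sims_to_new = [
    cosine(column(ratings, j), column(ratings, new_song))
    for j in range(n_items - 1)
]

# Two songs with shared listeners, by contrast, get a meaningful similarity.
sim_existing = cosine(column(ratings, 0), column(ratings, 1))

print(sims_to_new)
print(round(sim_existing, 3))
```

Every similarity involving the new song collapses to zero, so a neighborhood-based recommender has no basis for ranking it for anyone.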
Question 2 Multiple Choice
Matrix factorization handles the sparsity problem in collaborative filtering primarily by:
A. Filling in missing ratings with each item's average rating before computing user similarities
B. Removing users and items with fewer than a minimum number of interactions to reduce noise
C. Learning low-rank latent factor vectors that must generalize coherently across the entire matrix, preventing memorization of sparse observations
D. Requiring explicit user feedback before including new items in the factorization
Answer: C
The key insight is that a low-rank factorization R ≈ UV^T forces generalization. Because the rank is much smaller than the full matrix dimensions, the model cannot independently memorize each of the few observed entries; it must find latent patterns that explain many entries simultaneously. These patterns (latent factors) fill in unobserved cells not by imputation but by interpolation from learned structure. Options A and B are preprocessing heuristics, not the mechanism by which factorization solves sparsity, and option D concerns new items rather than sparsity.
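As a rough illustration of R ≈ UV^T, the sketch below fits latent factors by stochastic gradient descent on the observed cells only. The toy `observed` triples, the function names `factorize` and `predict`, and the hyperparameters (`lr`, `reg`, `epochs`, `seed`) are assumptions chosen for the example, and SGD with L2 regularization is just one common fitting procedure. At this toy scale the rank is not actually much smaller than the matrix, so the example shows the mechanics rather than the generalization benefit.

```python
import random

# Hypothetical toy ratings: (user, item, rating) triples for observed cells only.
observed = [
    (0, 0, 5.0), (0, 1, 4.0),
    (1, 0, 4.0), (1, 1, 5.0), (1, 2, 1.0),
    (2, 0, 1.0), (2, 2, 5.0),
    (3, 1, 1.0), (3, 2, 4.0),
]
n_users, n_items, rank = 4, 3, 2

def factorize(observed, n_users, n_items, rank,
              lr=0.02, reg=0.05, epochs=2000, seed=0):
    """Fit user factors U and item factors V so that U[u] . V[i] ~ rating."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in observed:
            pred = sum(U[u][k] * V[i][k] for k in range(rank))
            err = r - pred
            for k in range(rank):
                u_k, v_k = U[u][k], V[i][k]
                # Gradient step on squared error with L2 regularization.
                U[u][k] += lr * (err * v_k - reg * u_k)
                V[i][k] += lr * (err * u_k - reg * v_k)
    return U, V

def predict(U, V, u, i):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))

U, V = factorize(observed, n_users, n_items, rank)

# Root-mean-square error on the cells that were actually observed.
train_rmse = (
    sum((r - predict(U, V, u, i)) ** 2 for u, i, r in observed) / len(observed)
) ** 0.5

print(round(train_rmse, 3))
print(round(predict(U, V, 0, 2), 2))  # user 0 never rated item 2
```

The cell for user 0 and item 2 was never observed, yet it receives a prediction from the learned factors rather than from an imputed average.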
Question 3 True / False
Item-based collaborative filtering tends to be more stable than user-based collaborative filtering in practice because item similarity patterns change less frequently than user similarity patterns.
T. True
F. False
Answer: True
Items are largely fixed artifacts: a movie's genre and pacing do not change over time. User tastes and behavior evolve as people age, discover new interests, or change life circumstances. As a result, the item-item similarity matrix is relatively stable and can be precomputed, while user-user similarities must be recomputed frequently. Item-based CF also scales better when there are fewer items than users, which is common on large platforms.
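The precompute-then-score pattern can be sketched as follows. The toy `ratings` matrix, the use of 0 for "unrated", and the names `item_similarities` and `predict` are assumptions for illustration; the scoring rule shown (a similarity-weighted average of the user's own ratings) is one standard variant of item-based CF.

```python
import math

# Hypothetical toy ratings: rows are users, columns are items, 0 = unrated.
ratings = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
]

def cosine(a, b):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def item_similarities(ratings):
    """Build the full item-item similarity matrix once, e.g. in an offline job."""
    cols = list(zip(*ratings))
    n = len(cols)
    return [[cosine(cols[i], cols[j]) for j in range(n)] for i in range(n)]

def predict(ratings, sims, user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    num = den = 0.0
    for j, r in enumerate(ratings[user]):
        if j != item and r > 0:
            num += sims[item][j] * r
            den += abs(sims[item][j])
    return num / den if den else 0.0

sims = item_similarities(ratings)                # precomputed, reused across requests
score = predict(ratings, sims, user=0, item=2)   # user 0 has not rated item 2

print(round(score, 2))
```

Because `sims` depends only on item columns, it can be cached and refreshed occasionally, while each request only needs the requesting user's own row.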
Question 4 True / False
Collaborative filtering improves recommendations by combining user interaction patterns with item content features such as genre, description, or attributes.
T. True
F. False
Answer: False
This describes a hybrid recommender system, not pure collaborative filtering. The defining characteristic of collaborative filtering is that it ignores item features entirely — it works solely from the pattern of who rated what. This is both its strength (it can discover unexpected connections that content analysis would miss) and its weakness (it cannot handle new items with no ratings, even if those items have rich content metadata).
Question 5 Short Answer
Why does collaborative filtering work at all, given that it ignores what items actually are or what users explicitly say they want?
Model answer: Collaborative filtering exploits the empirical regularity that people with similar taste histories tend to have similar future preferences. If two users have agreed on dozens of items in the past, their shared pattern of agreement is more predictive than any feature analysis. The interaction matrix encodes implicit information about latent dimensions of taste — without needing to name or understand those dimensions. The algorithm discovers structure in behavior rather than structure in content.
This is the philosophical core of the approach. Content-based systems rely on explicit feature engineering (someone must decide that genre and director matter). Collaborative filtering is agnostic about why people agree — it just finds that they do. Matrix factorization makes this concrete: the latent factors are learned, not designed. This is why CF can surface recommendations that content analysis would never generate — it finds patterns that transcend any hand-crafted feature space.