Collaborative Filtering

Graduate Depth 62 in the knowledge graph I know this Set as goal
collaborative-filtering matrix-factorization user-item

Core Idea

Collaborative filtering predicts preferences by finding patterns in user-item interaction matrices. User-based approaches find similar users; item-based find similar items. Matrix factorization decomposes the interaction matrix into latent user and item factors. The core assumption is that similar users like similar items.

Explainer

From your study of recommendation systems, you know the basic goal: predict whether a user will like an item they haven't interacted with yet. Content-based approaches do this by analyzing item features (genre, description, attributes), but collaborative filtering takes a radically different approach — it ignores item features entirely and works solely from the pattern of who liked what. The fundamental insight is that if two users agreed on many items in the past, they are likely to agree on items in the future. You don't need to know *why* they liked those items — the agreement pattern is enough.

User-based collaborative filtering implements this directly. To predict whether user A will like a movie, find the users most similar to A (based on their shared rating history), then average those similar users' ratings for the movie in question. Similarity is typically measured by cosine similarity or Pearson correlation across the ratings vector. Item-based collaborative filtering flips the perspective: instead of finding similar users, it finds items similar to ones user A already liked (where "similar" means they tend to be rated similarly by the same users) and predicts ratings based on those item similarities. Item-based approaches tend to be more stable in practice because item similarity patterns change less frequently than user similarity patterns, and they scale better when there are fewer items than users.

Both approaches face a critical problem: the user-item matrix is extremely sparse. A typical platform might have millions of users and hundreds of thousands of items, but each user has interacted with only a tiny fraction — often less than 1% of all items. Computing similarities from such sparse vectors is noisy and unreliable. Matrix factorization addresses this by decomposing the sparse user-item matrix R into two smaller dense matrices: a user matrix U (each row is a user's latent factor vector) and an item matrix V (each row is an item's latent factor vector), such that R ≈ UV^T. Each latent factor captures an abstract dimension of taste — perhaps one factor corresponds roughly to "preference for action vs. drama" and another to "tolerance for long runtime," though the factors are learned automatically and are not always interpretable.

The elegance of matrix factorization is that predicting user i's rating for item j becomes simply the dot product of their latent vectors: r̂ᵢⱼ = uᵢ · vⱼ. Training learns U and V by minimizing the prediction error on observed ratings (often with regularization to prevent overfitting to the sparse data). This approach, famously used by the winning entry in the Netflix Prize, handles sparsity gracefully because the low-rank factorization forces the model to generalize — it cannot simply memorize the few observed entries but must find coherent latent patterns that explain the entire matrix. The trade-off is the cold-start problem: collaborative filtering cannot recommend items that no one has rated yet, or make predictions for brand-new users with no history, since there are no interaction patterns to leverage.

Practice Questions 5 questions

Prerequisite Chain

Counting to 10Counting to 20Understanding ZeroThe Number ZeroCounting to FiveOne-to-One CorrespondenceCombining Small Groups Within 5Addition Within 10Addition Within 20Two-Digit Addition Without RegroupingTwo-Digit Addition with RegroupingAddition Within 100Repeated Addition as MultiplicationMultiplication Facts Within 100Division as Equal SharingDivision as Grouping (Measurement Division)Division: Grouping (Repeated Subtraction) ModelDivision: Fair Sharing ModelDivision as Equal SharingDivision as GroupingBasic Division FactsDivision Facts Within 100Two-Digit by One-Digit DivisionDivision with RemaindersRemainders and Quotients in DivisionDivision Word ProblemsIntroduction to Long DivisionFactors and MultiplesPrime and Composite NumbersEquivalent FractionsRelating Fractions and DecimalsDecimal Place ValueReading and Writing DecimalsComparing and Ordering DecimalsAdding and Subtracting DecimalsMultiplying DecimalsDividing DecimalsDividing FractionsMixed Number ArithmeticOrder of OperationsInteger Order of OperationsVariable ExpressionsCombining Like TermsOne-Step EquationsTwo-Step EquationsSolving Multi-Step EquationsEquations with Variables on Both SidesLiteral EquationsSlope-Intercept FormPoint-Slope FormWriting Linear EquationsParallel and Perpendicular Line SlopesGraphing Linear EquationsSystems of Equations — Graphing MethodSystems of Equations — Elimination MethodSystems of Three VariablesMatrices IntroductionLinear TransformationsEigenvalues and EigenvectorsDiagonalizationPrincipal Component AnalysisDimensionality Reduction TechniquesCollaborative Filtering

Longest path: 63 steps · 320 total prerequisite topics

Prerequisites (2)

Leads To (0)

No topics depend on this one yet.