Chapter 4 · Part 2

Taste as coordinates

Comparing entire rows of likes (Chapter 3) gets slow and noisy fast. The modern fix is the same beautiful trick behind word embeddings: give every user and every item a short list of numbers — a vector — placing them in a shared "taste space" where closeness predicts preference.

If you took the embeddings course, this will feel familiar. Here the points aren't words; they're people and things they might watch — but they live in the same kind of space, and nearness still means similarity.

Every item gets coordinates in a hidden space — no one hand-labels the axes.

Latent factors: axes nobody names

This comes from matrix factorization. You start with that giant, sparse user–item matrix and "factor" it into two skinny tables: one with a short row of numbers per user, the other with a short row per item, chosen so that the dot product of a user's row and an item's row approximates that user's rating of that item. Each number is a latent factor, a learned dimension of taste. Crucially, nobody labels them. The model invents whatever axes best explain the ratings; we read meaning into them afterward ("ah, this axis looks like calm-vs-intense").
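
To make "factoring" concrete, here is a minimal sketch that learns both tables with stochastic gradient descent, one common way to fit a factorization (not necessarily what any particular production system uses). The toy ratings matrix, the function names `factorize` and `observed_rmse`, and every hyperparameter value are assumptions made up for illustration.

```python
import numpy as np

def factorize(ratings, n_factors=2, n_epochs=500, lr=0.02, reg=0.02, seed=0):
    """Factor a ratings matrix (np.nan = unrated) into user and item vectors.

    Returns (user_vectors, item_vectors) such that
    user_vectors @ item_vectors.T approximates the observed ratings.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    # Start from small random vectors; training moves them into place.
    user_vectors = rng.normal(scale=0.1, size=(n_users, n_factors))
    item_vectors = rng.normal(scale=0.1, size=(n_items, n_factors))

    # Only observed cells contribute to the loss; missing ones are skipped.
    observed = [(u, i) for u in range(n_users) for i in range(n_items)
                if not np.isnan(ratings[u, i])]

    for _ in range(n_epochs):
        for idx in rng.permutation(len(observed)):
            u, i = observed[idx]
            p_u = user_vectors[u].copy()
            q_i = item_vectors[i].copy()
            err = ratings[u, i] - p_u @ q_i          # prediction error for this cell
            # Nudge both vectors to shrink the error, with L2 regularization.
            user_vectors[u] += lr * (err * q_i - reg * p_u)
            item_vectors[i] += lr * (err * p_u - reg * q_i)
    return user_vectors, item_vectors


def observed_rmse(ratings, user_vectors, item_vectors):
    """Root-mean-square error over the cells that actually have ratings."""
    mask = ~np.isnan(ratings)
    pred = user_vectors @ item_vectors.T
    return float(np.sqrt(np.mean((ratings[mask] - pred[mask]) ** 2)))


# Toy data (made up): 4 users x 4 items, np.nan marks "not rated yet".
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [np.nan, 1.0, 5.0, 4.0],
])
user_vectors, item_vectors = factorize(ratings)
```

Evaluating `user_vectors @ item_vectors.T` then gives a predicted score for every cell, including the ones that were `np.nan` in the input.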

Why this is such a leap

Two payoffs over raw collaborative filtering:

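Speed & scale. Comparing two short vectors costs only a handful of multiplications, so scoring millions of items against your one vector is a single matrix-vector product, cheap enough to run at request time. (Very large catalogs often add an approximate nearest-neighbor index on top.)
Generalization. Because taste is compressed into a few factors, the model can relate a user and an item that share no direct co-likes, as long as they land near each other in the space. In effect, it predicts a score for every empty cell of the sparse matrix.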
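
As a rough illustration of the speed point, the sketch below scores every item against one user with a single matrix-vector product and then picks the best few without sorting the whole catalog. The random vectors stand in for learned ones, and the helper name `top_k_items` is hypothetical.

```python
import numpy as np

def top_k_items(p_user, item_vectors, k):
    """Return the indices of the k highest-scoring items, best first."""
    scores = item_vectors @ p_user            # one dot product per item, in one call
    k = min(k, len(scores))
    # argpartition finds the k best candidates without a full sort...
    candidates = np.argpartition(-scores, k - 1)[:k]
    # ...and only those k are then sorted by score.
    return candidates[np.argsort(-scores[candidates])]

# Stand-in data (random, not learned): 100,000 items with 16 factors each.
rng = np.random.default_rng(0)
item_vectors = rng.normal(size=(100_000, 16))
p_user = rng.normal(size=16)

top = top_k_items(p_user, item_vectors, k=10)
```

Nothing here depends on whether the user ever rated anything in common with the people who rated a given item; every item gets a score from its vector alone.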

This is also the bridge to today's deep recommenders: many of those replace the simple dot product with a neural network and add extra features, but the core idea is exactly this: learn a vector for everything, then score a user against an item by comparing their vectors.
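
To show what "swapping the dot product for a neural network" can look like, here is a minimal sketch that puts the two scoring functions side by side. The tiny one-hidden-layer network, its sizes, and its random untrained weights are all assumptions for illustration; a real model would learn these weights together with the vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
n_factors, hidden = 8, 16

# Stand-ins for a learned user vector and item vector.
p_user = rng.normal(size=n_factors)
q_item = rng.normal(size=n_factors)

def dot_score(p, q):
    """Classic matrix factorization: the score is the dot product."""
    return float(p @ q)

# Hypothetical, untrained weights for a one-hidden-layer network.
W1 = rng.normal(scale=0.1, size=(hidden, 2 * n_factors))
b1 = np.zeros(hidden)
w2 = rng.normal(scale=0.1, size=hidden)

def mlp_score(p, q):
    """Neural variant: feed both vectors through a small network instead."""
    x = np.concatenate([p, q])            # extra features could be appended here
    h = np.maximum(0.0, W1 @ x + b1)      # ReLU hidden layer
    return float(w2 @ h)

classic = dot_score(p_user, q_item)
neural = mlp_score(p_user, q_item)
```

Both functions take the same two vectors and return one number; only the way the vectors are compared changes.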

mf.py — predicting from learned taste vectors

# After training, every user and item has a learned vector.
# Assumed here: user_vectors and item_vectors are NumPy arrays with one
# row per user / per item, and k is how many recommendations to return.
p_user = user_vectors[user_id]      # e.g. [0.8, -0.3, 0.5, ...]
q_item = item_vectors[item_id]      # e.g. [0.7, -0.1, 0.6, ...]

score = p_user @ q_item             # dot product = predicted rating

# Recommend: the k items whose vectors score highest with yours.
item_ids = range(len(item_vectors))
top = sorted(item_ids, key=lambda i: p_user @ item_vectors[i], reverse=True)[:k]

Where we're headed

Taste vectors are powerful — but only once the model has seen enough of your ratings to place your point. On day one, with zero history, your vector is meaningless. So what does the feed show a brand-new user? Next: the cold-start problem.