Chapter 6 · Part 3

The loop you're in

Every chapter so far treated your taste as something the system reads. The unsettling truth is that the system also writes it. The feed shows you things → you react → those reactions train the next feed → which shapes what you react to next. It's a feedback loop, and you're inside it.

Run that loop while optimizing purely for engagement and something predictable happens: the feed discovers one or two topics you reliably click, and — because that maximizes the score — shows you more and more of them, and less of everything else. That narrowing is the filter bubble.

The loop: your feed → you watch → signals → the model → your feed again.
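
The narrowing can be reproduced in a few lines of Python. This is a minimal sketch, not how any real feed is built. Everything in it is an assumption made for illustration: the five topics, the click probabilities, the `run_loop` name, and the `sharpness` exponent, which stands in for a ranker that over-serves whatever has scored best so far. It uses expected clicks rather than random ones so the run is repeatable.

```python
def run_loop(rounds=50, feed_size=10, sharpness=2.0):
    """Return the top topic's share of the feed after each round."""
    # Hypothetical user: likes every topic, and "music" only slightly more.
    click_prob = {"cooking": 0.30, "music": 0.35, "sports": 0.30,
                  "science": 0.30, "travel": 0.30}
    clicks = {topic: 1.0 for topic in click_prob}  # start from an even history
    top_share = []
    for _ in range(rounds):
        # The model: over-weight whatever has collected the most clicks.
        weights = {t: c ** sharpness for t, c in clicks.items()}
        total = sum(weights.values())
        share = {t: w / total for t, w in weights.items()}
        top_share.append(max(share.values()))
        # You watch: expected clicks flow back in as the next round's signals.
        for t in clicks:
            clicks[t] += feed_size * share[t] * click_prob[t]
    return top_share


shares = run_loop()
print(f"top topic share: first round {shares[0]:.2f}, last round {shares[-1]:.2f}")
```

The top topic starts at an even fifth of the feed and its share grows every round, even though this imaginary user likes it only slightly more than the rest. Nothing about the user changed; the loop did the narrowing.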

Why the bubble forms

It's not a conspiracy — it's the math doing exactly what we asked. If the score is "probability you engage," then the safest bet is always more of what already worked. Three forces compound:

  • Rich-get-richer: popular items get shown more, so they get more engagement, so they're shown even more.
  • Self-fulfilling taste: the model only learns from what it chose to show you; topics it never surfaces look like things you "don't like," even if you would enjoy them.
  • Engagement ≠ wellbeing: outrage, cliffhangers and doomscrolling are engaging, so a pure-engagement objective quietly over-weights them.
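
The second force, self-fulfilling taste, is the easiest to see in code. Below is a minimal sketch, assuming a made-up user and two made-up topics. The `greedy_estimates` function, the topic names and all the numbers are hypothetical. The model starts with a wrong guess about gardening, always shows the topic it currently rates highest, and updates its estimate only for what it showed (using expected clicks, so there is no randomness).

```python
def greedy_estimates(true_rate, prior, rounds=100):
    """Pure exploitation: always show the topic with the best current estimate."""
    shows = {topic: 0 for topic in true_rate}
    clicks = {topic: 0.0 for topic in true_rate}
    estimate = dict(prior)  # the model's starting guesses
    for _ in range(rounds):
        shown = max(estimate, key=estimate.get)
        shows[shown] += 1
        clicks[shown] += true_rate[shown]  # expected clicks, no noise
        # Only the topic that was shown gets its estimate updated.
        estimate[shown] = clicks[shown] / shows[shown]
    return estimate, shows


estimate, shows = greedy_estimates(
    true_rate={"news": 0.4, "gardening": 0.7},  # what the user would actually do
    prior={"news": 0.5, "gardening": 0.3},      # what the model believes at first
)
print(shows)
print(estimate)
```

In this toy run gardening is never shown, so its estimate never moves from the wrong starting guess, while news settles at its true rate and keeps winning. The model has no data that could ever correct it.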

Explore vs exploit

The cure is borrowed from the explore–exploit tradeoff. Exploiting means serving the sure thing; exploring means occasionally showing something uncertain to learn whether you'd like it. Too little exploration and you're trapped in the bubble; too much and the feed feels random. Good systems spend most of the budget exploiting and a deliberate slice exploring.

explore.py — engagement plus an exploration bonus
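import random

# score() and uncertainty() are assumed to be defined elsewhere:
# score() is the predicted-engagement model, uncertainty() measures how
# little the model knows about this user-item pair.
def rank(user, candidates, beta=0.2, k=10):      # k: feed length (illustrative default)
    def value(item):
        exploit = score(user, item)               # predicted engagement
        explore = beta * uncertainty(user, item)  # bonus for the unknown
        return exploit + explore
    feed = sorted(candidates, key=value, reverse=True)[:k]
    top = feed[:3]
    random.shuffle(top)                           # a little diversity at the top
    feed[:3] = top                                # write the shuffled slice back
    return feed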
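
The listing leaves open how uncertainty is measured. One simple possibility, offered here as an assumption rather than as the method any particular feed uses, is a count-based bonus: the fewer times a topic has been shown, the larger the bonus. In the sketch below the `uncertainty` signature, the `rank_topics` helper and the numbers are all hypothetical, and topics stand in for items to keep it short.

```python
import math


def uncertainty(shows):
    """Count-based bonus: 1.0 for a never-shown topic, shrinking with exposure."""
    return 1.0 / math.sqrt(1 + shows)


def rank_topics(estimate, shows, beta=0.2):
    """Order topics by predicted engagement plus an exploration bonus."""
    def value(topic):
        return estimate[topic] + beta * uncertainty(shows[topic])
    return sorted(estimate, key=value, reverse=True)


estimate = {"news": 0.40, "gardening": 0.30}  # the model's current guesses
shows = {"news": 100, "gardening": 0}         # gardening has never been surfaced

print(rank_topics(estimate, shows, beta=0.0))  # → ['news', 'gardening']
print(rank_topics(estimate, shows, beta=0.2))  # → ['gardening', 'news']
```

With `beta=0.0` the ranking is pure exploitation and the well-known topic wins. At `beta=0.2` the never-shown topic gets its chance, and once it has been shown a few times its bonus shrinks and it has to earn its place on engagement alone. Turning `beta` up or down is the "too much or too little exploration" dial.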

You now know how your feed knows you

Pull it together and the whole feed is one pipeline:

  • It's a ranking — score every candidate, show the top few.
  • It learns your taste from implicit signals — your taps, not a survey.
  • Collaborative filtering recommends what people like you enjoyed.
  • Taste vectors compress users and items into a shared space for fast, general predictions.
  • Cold start falls back on content and popularity until your history arrives.
  • And the feedback loop means the feed shapes you as much as you shape it.

Knowing the machinery is its own kind of power: when the next "perfect" video autoplays, you'll know it's a scored guess inside a loop — and that a little deliberate exploration, on your side too, is how you climb out of the bubble.