Chapter 4 · Part 2

Pixel math

Part 1 was about what an image is: a tensor of numbers. Part 2 is about what we can do to it. We'll start with the simplest possible edits — the ones where the new value of a pixel depends only on its own old value.

Every brightness slider, every "invert colours" button, every black-and-white threshold is the same idea underneath: a function applied to every pixel. Feed in a pixel's value, get a new value out, do that for all of them.

Scroll to watch four classic operations: brightness, contrast, invert and threshold. Keep an eye on all three panels (the image, the function being applied, and the histogram of brightnesses); they all move together.

Brightness: add the same number to every pixel — the whole histogram slides right.

The function is the whole story

That middle panel — the curve — is the operation. Read it as: find your input brightness on the horizontal axis, go up to the curve, read the new value off the vertical axis.

  • Brightness shifts the line up or down.
  • Contrast tilts it steeper around the middle grey.
  • Invert flips it into a downhill line.
  • Threshold bends it into a hard step.

Because the same curve is applied to every pixel independently, these are called point operations — the output pixel depends on one input pixel and nothing else. (Next chapter we'll break that rule, and that's where things get interesting.)
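
One way to make "the curve is the operation" concrete, assuming 8-bit greyscale values from 0 to 255, is to store the curve as a 256-entry lookup table and let every pixel index into it. The names `make_lut` and `apply_lut` and the three sample curves are illustrative, not taken from any library; treat this as a sketch rather than the definitive way to do it.

```python
import numpy as np

def make_lut(fn):
    """Sample a curve at every possible 8-bit input, giving a 256-entry table."""
    levels = np.arange(256, dtype=np.int32)            # the horizontal axis: 0..255
    return np.clip(fn(levels), 0, 255).astype(np.uint8)

def apply_lut(img, lut):
    """Each pixel looks up its own new value; no pixel sees any other."""
    return lut[img]

# Hypothetical curves for illustration.
brighten = make_lut(lambda v: v + 40)                      # line shifted up
invert   = make_lut(lambda v: 255 - v)                     # downhill line
step     = make_lut(lambda v: np.where(v < 128, 0, 255))   # hard step

img = np.array([[0, 100], [200, 255]], dtype=np.uint8)
out = apply_lut(img, brighten)   # 0 -> 40, 100 -> 140, 200 -> 240, 255 -> 255 (clipped)
print(out)
```

Plotting any of these tables against `np.arange(256)` would give the kind of curve shown in the middle panel.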

Reading the histogram

The histogram counts how many pixels fall at each brightness. It's the quickest way to see what an operation did, without staring at the picture:

  • A dark image piles up on the left; a bright one on the right.
  • Low contrast bunches into a narrow band; high contrast spreads wide.
  • Threshold leaves just two bars — everything is now black or white.
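
The observations in the list above can be checked numerically. The sketch below builds a made-up low-contrast image (random values in a narrow mid-grey band, an assumption chosen purely for illustration), then reports which brightness levels are occupied before and after a contrast stretch and a threshold. `hist256` and `occupied` are hypothetical helper names.

```python
import numpy as np

def hist256(img):
    """Count pixels at each of the 256 brightness levels."""
    return np.bincount(np.asarray(img).astype(np.uint8).ravel(), minlength=256)

def occupied(h):
    """Brightness levels that at least one pixel landed on."""
    return np.flatnonzero(h)

# Hypothetical low-contrast test image: every value sits between 110 and 145.
rng = np.random.default_rng(0)
img = rng.integers(110, 146, size=(32, 32)).astype(np.int32)

stretched = np.clip(128 + (img - 128) * 3, 0, 255)   # contrast up around mid grey
binary    = np.where(img < 128, 0, 255)              # threshold at mid grey

for name, im in [("original", img), ("stretched", stretched), ("binary", binary)]:
    levels = occupied(hist256(im))
    print(name, "spans", levels.min(), "to", levels.max(), "using", len(levels), "levels")
```

The stretched image should report a wider span than the original, and the binary image should use only the two extreme levels.
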
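
point_ops.py: operations are functions over an array
import numpy as np

# Assumes img is an 8-bit greyscale array with values 0..255.
# uint8 arithmetic wraps around (200 + 100 gives 44), so widen before doing maths.
def _wide(img):          return np.asarray(img).astype(np.int32)

def brightness(img, b):  return np.clip(_wide(img) + b, 0, 255).astype(np.uint8)
def contrast(img, k):    return np.clip(128 + (_wide(img) - 128) * k, 0, 255).astype(np.uint8)
def invert(img):         return 255 - np.asarray(img)
def threshold(img, t):   return np.where(np.asarray(img) < t, 0, 255).astype(np.uint8)

img = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in image
hist, _ = np.histogram(img, bins=256, range=(0, 256))  # one bin per brightness level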

Notice there's no loop over pixels — NumPy applies the function to the whole array at once. That's the same "operate on the entire tensor" mindset from Chapter 3.
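
To see what the whole-array form is standing in for, here is a sketch that writes the brightness operation both ways and checks that they agree. The function names and the random test image are assumptions made for the example; the loop version is there only for comparison and would be far slower on a real image.

```python
import numpy as np

def brightness_loop(img, b):
    """Explicit version: visit each pixel and apply the same function."""
    out = np.empty_like(img)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            out[r, c] = min(255, max(0, int(img[r, c]) + b))
    return out

def brightness_vec(img, b):
    """Whole-array version: widen first so uint8 arithmetic cannot wrap."""
    return np.clip(img.astype(np.int32) + b, 0, 255).astype(np.uint8)

# Hypothetical 8-bit test image.
img = np.random.default_rng(1).integers(0, 256, size=(48, 64), dtype=np.uint8)
same = np.array_equal(brightness_loop(img, 60), brightness_vec(img, 60))
print("loop and vectorised results match:", same)
```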

The limit of point operations

Point operations are powerful but blind: each pixel is processed in total isolation. They can never detect an edge, a corner, or a texture, because those things are about how a pixel relates to its neighbours.
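
A small experiment makes this blindness visible. The sketch below, using a made-up image with a single vertical edge, scrambles the pixel positions and shows that a point operation cannot tell the difference: processing then scrambling gives the same result as scrambling then processing, and the histogram is unchanged. The choice of `invert` is arbitrary; any point operation should behave the same way.

```python
import numpy as np

def invert(img):
    return 255 - img          # any point operation would do here

rng = np.random.default_rng(2)

# Hypothetical image with one sharp vertical edge: dark left half, bright right half.
img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 200

# Scramble pixel positions: the edge is destroyed, but every value survives.
perm = rng.permutation(img.size)
scrambled = img.ravel()[perm].reshape(img.shape)

# Processing then scrambling equals scrambling then processing.
a = invert(img).ravel()[perm].reshape(img.shape)
b = invert(scrambled)
print(np.array_equal(a, b))                    # True

# The histograms of the two images are identical as well.
h_img = np.bincount(img.ravel(), minlength=256)
h_scr = np.bincount(scrambled.ravel(), minlength=256)
print(np.array_equal(h_img, h_scr))            # True
```

The edge exists only in the arrangement of the pixels, and that arrangement is exactly the information a point operation never reads.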

To see structure, an operation has to look at a little neighbourhood around each pixel. That is exactly what convolution does, and it's where we're heading next.