Generators compose. The output of one is iterable, so it can be the input of another:
def numbers(n):
    for i in range(n):
        yield i

def evens(source):
    for x in source:
        if x % 2 == 0:
            yield x

def squared(source):
    for x in source:
        yield x * x

pipeline = squared(evens(numbers(10)))
print(list(pipeline))  # [0, 4, 16, 36, 64]

Reading from inside out?
Right. numbers(10) is the source — a generator yielding 0..9. Pass it into evens — a generator that loops over its source and yields only even values. Pass that into squared. Each layer is a generator that pulls from the layer below.
When list(pipeline) runs, what's the order things actually happen in?
Element by element. list calls next(pipeline). That enters squared's body (for x in source), which calls next() on the evens generator. That in turn calls next() on numbers(10) and gets 0. evens checks: 0 % 2 == 0? Yes, so it yields 0 to squared. squared yields 0 * 0 == 0 to list. list appends 0 and calls next again. Now numbers yields 1; evens skips it (odd); numbers yields 2; evens yields it; squared yields 4. And so on.
One element flows through the entire pipeline before the next one starts. No intermediate lists are built.
That's the win — no list of evens, no list of squared evens. Just one value at a time.
Exactly. For 10 items the saving is meaningless; for 10 million it's the difference between a script that runs and one that crashes with MemoryError.
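To see that interleaving rather than take it on faith, here is a minimal sketch that records each step in a shared list. The traced_* names and the log parameter are illustrative additions, not part of the lesson's pipeline; the logic of each stage is otherwise the same.

```python
# Traced copies of the three stages. The `log` list and the traced_* names
# are hypothetical helpers added only to make the execution order visible.
def traced_numbers(n, log):
    for i in range(n):
        log.append(f"numbers yields {i}")
        yield i

def traced_evens(source, log):
    for x in source:
        if x % 2 == 0:
            log.append(f"evens passes {x}")
            yield x
        else:
            log.append(f"evens skips {x}")

def traced_squared(source, log):
    for x in source:
        log.append(f"squared yields {x * x}")
        yield x * x

log = []
result = list(traced_squared(traced_evens(traced_numbers(4, log), log), log))

for entry in log:
    print(entry)
print(result)  # [0, 4]
```

The log starts with "numbers yields 0", "evens passes 0", "squared yields 0", and only then "numbers yields 1". Value 0 travels through all three stages before value 1 is even produced.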
Each stage is a generator that takes a source iterable, transforms it, and yields the results. Three of these chained together form a pipeline:
def numbers(n):
    for i in range(n):
        yield i

def evens(source):
    for x in source:
        if x % 2 == 0:
            yield x

def squared(source):
    for x in source:
        yield x * x

pipeline = squared(evens(numbers(10)))
for v in pipeline:
    print(v)
# 0, 4, 16, 36, 64

The pipeline doesn't build any intermediate list. Each next() request bubbles up from the consumer (for v in pipeline) to the source (numbers(10)), one value at a time, through every transform.
list ──> squared ──> evens ──> numbers
   next?       next?      next?    → 0
   0 ← yield ← 0 ← yield ← 0
(Drawing it on paper helps the first time.)
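If paper isn't handy, the same request-and-reply flow can be driven by hand with next(). This is a small sketch using the lesson's three generators on a shorter range; the variable names first, second, third, and exhausted are only for illustration.

```python
def numbers(n):
    for i in range(n):
        yield i

def evens(source):
    for x in source:
        if x % 2 == 0:
            yield x

def squared(source):
    for x in source:
        yield x * x

pipeline = squared(evens(numbers(5)))

# Each next() is one request that travels down to numbers and back up.
first = next(pipeline)   # numbers gives 0, evens passes it, squared yields 0
second = next(pipeline)  # 1 is skipped, 2 is passed on, squared yields 4
third = next(pipeline)   # 3 is skipped, 4 is passed on, squared yields 16

# Once the source runs dry, the request comes back as StopIteration.
try:
    next(pipeline)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)  # 0 4 16 True
```

A for loop or list() makes exactly these calls for you and treats StopIteration as the signal to stop.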
With N=10 the difference is invisible. With N=10,000,000:
# List version: builds three full lists
# (named differently so they don't shadow the generator functions)
all_numbers = list(range(10_000_000))
even_numbers = [x for x in all_numbers if x % 2 == 0]
squares = [x * x for x in even_numbers]
# 200MB+ of memory peak

# Generator version: constant memory
for v in squared(evens(numbers(10_000_000))):
    ...
# A few KB at most

The same pipeline written with generator expressions:
nums = (i for i in range(10))
evns = (x for x in nums if x % 2 == 0)
sqrs = (x * x for x in evns)
print(list(sqrs))  # [0, 4, 16, 36, 64]

break works

A pipeline you can stop early without paying for the rest:
for v in squared(evens(numbers(1_000_000))):
    if v > 100:
        break
    print(v)
# 0, 4, 16, 36, 64, 100, and we stop. The remaining 999,987 source values are never generated.

This is the lazy-evaluation payoff: only the work you ask for actually happens.
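One way to check how little work actually happens is to count the values that leave the source. In this sketch, counted_numbers and the counter dict are hypothetical additions for measurement; evens and squared are the lesson's own stages.

```python
# counted_numbers and `counter` are illustrative helpers, not lesson code.
def counted_numbers(n, counter):
    for i in range(n):
        counter["pulled"] += 1  # one more value requested from the source
        yield i

def evens(source):
    for x in source:
        if x % 2 == 0:
            yield x

def squared(source):
    for x in source:
        yield x * x

counter = {"pulled": 0}
seen = []
for v in squared(evens(counted_numbers(1_000_000, counter))):
    if v > 100:
        break
    seen.append(v)

print(seen)               # [0, 4, 16, 36, 64, 100]
print(counter["pulled"])  # 13
```

The source hands out 0 through 12 and nothing more: 12 squared is 144, which trips the break, so the loop never asks for another value.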
# Defeats the point
intermediate = list(evens(numbers(1_000_000)))  # builds a half-million-item list
result = list(squared(intermediate))

Keep it generator-shaped end to end. Use list() only at the very edge, when you genuinely need a list (because you'll iterate it twice or check len).
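To put rough numbers on the cost of materializing in the middle, the standard library's tracemalloc module can report peak allocations for each approach. This is a sketch under assumptions: peak_bytes, lazy, materialized, and the size N are made-up names and values for the comparison, and the exact byte counts will vary by Python version and platform.

```python
import tracemalloc

def numbers(n):
    for i in range(n):
        yield i

def evens(source):
    for x in source:
        if x % 2 == 0:
            yield x

def squared(source):
    for x in source:
        yield x * x

N = 200_000  # arbitrary size, kept small so the comparison runs quickly

def lazy():
    # Generator-shaped end to end: sum() consumes one value at a time.
    return sum(squared(evens(numbers(N))))

def materialized():
    # Same result, but with a full list built in the middle.
    intermediate = list(evens(numbers(N)))
    return sum(squared(intermediate))

def peak_bytes(run):
    """Return the peak number of bytes allocated while run() executes."""
    tracemalloc.start()
    run()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

lazy_peak = peak_bytes(lazy)
materialized_peak = peak_bytes(materialized)
print(f"lazy peak:         {lazy_peak:,} bytes")
print(f"materialized peak: {materialized_peak:,} bytes")
```

Both functions return the same sum; the materialized version should show a much higher peak because it holds every even number in memory at once, and that gap grows with N.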