A function with yield instead of return is a generator. Compare:
def numbers_list(n):
    result = []
    for i in range(n):
        result.append(i)
    return result

def numbers_gen(n):
    for i in range(n):
        yield i

print(numbers_list(5))        # [0, 1, 2, 3, 4]
print(list(numbers_gen(5)))   # [0, 1, 2, 3, 4]

Same output — but the second one builds nothing until you ask. Each call to next() runs the body until the next yield, hands you that value, and pauses.
Why is that useful?
Two big reasons. Memory — numbers_list(10_000_000) builds a 10-million-item list. numbers_gen(10_000_000) produces one value at a time; the rest don't exist until you ask. Composability — generators can be chained. The output of one feeds the input of another, and each value flows through the whole pipeline before the next one starts. We'll see that tomorrow.
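A tiny sketch of that chaining, using two made-up stages (doubled and only_even are names invented here for illustration):

```python
def doubled(numbers):
    for n in numbers:
        yield n * 2

def only_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

# Each value travels through both stages before the next one starts.
pipeline = only_even(doubled(range(5)))
print(list(pipeline))  # [0, 2, 4, 6, 8]
```

Nothing runs until list() starts pulling; the stages sit paused in between.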
What does yield i actually do?
Two things at once. (1) Hand i back to whoever called next(). (2) Pause the function — local variables, the spot in the loop, all preserved. The next next() call resumes right after the yield. When the function reaches the end, Python raises StopIteration — a for loop catches that automatically and stops.
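A minimal sketch to make the pause-and-resume visible (two_steps is a throwaway name for this demo):

```python
def two_steps():
    print("running up to the first yield")
    yield 1
    print("resumed right after the first yield")
    yield 2
    # falling off the end raises StopIteration

g = two_steps()
print(next(g))  # prints the first message, then 1
print(next(g))  # prints the second message, then 2
# a third next(g) here would raise StopIteration
```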
And list(generator)?
Drains the generator into a list — calls next() over and over until StopIteration. Useful for testing or when you actually need the whole thing in memory. For most pipelines, you stay generator-shaped end to end.
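What list(generator) does, spelled out by hand. A sketch, with drain and count_up defined here just for illustration:

```python
def count_up(n):
    for i in range(n):
        yield i

def drain(gen):
    """Rough equivalent of list(gen): call next() until StopIteration."""
    out = []
    while True:
        try:
            out.append(next(gen))
        except StopIteration:
            return out

print(drain(count_up(3)))  # [0, 1, 2]
```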
yield — generators

A function with yield is a generator function. Calling it returns a generator object — not the result, but something you can iterate.
def count_up(n):
    for i in range(n):
        yield i

g = count_up(3)
print(next(g))  # 0
print(next(g))  # 1
print(next(g))  # 2
print(next(g))  # raises StopIteration

Each next(g) runs the body until the next yield — then pauses, returning that value.
for automates next + StopIteration

for i in count_up(3):
    print(i)
# 0
# 1
# 2

This is what every for loop does — generators just make the protocol explicit.
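Roughly what that for loop expands to. A sketch of the iterator protocol (count_up repeated here so the snippet stands alone):

```python
def count_up(n):
    for i in range(n):
        yield i

it = iter(count_up(3))  # generators are their own iterators; iter() returns them unchanged
while True:
    try:
        i = next(it)
    except StopIteration:
        break           # the for loop catches this and exits quietly
    print(i)            # 0, then 1, then 2
```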
1. Memory. A list of a million items takes a million slots in RAM. A generator producing them one at a time uses constant memory:
# RAM-heavy
result = [x * 2 for x in range(10_000_000)]  # 80MB+ list

# RAM-light
def doubled():
    for x in range(10_000_000):
        yield x * 2

for v in doubled():
    ...  # never holds more than one value in memory

2. Lazy evaluation. You can break early and the rest never runs:
for v in count_up(1_000_000):
    if v > 5:
        break
# only iterated 7 values; the other 999,993 never happened

3. Streaming. You can yield from an infinite source — log lines, network bytes, generated sequences. A generator can be infinite because it produces on demand.
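A sketch of an infinite generator (naturals is an invented name), consumed safely with itertools.islice:

```python
import itertools

def naturals():
    n = 0
    while True:   # no end; it produces values forever, one per request
        yield n
        n += 1

# islice pulls the first 5 and stops; the generator never runs past that
print(list(itertools.islice(naturals(), 5)))  # [0, 1, 2, 3, 4]
```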
list(generator) — drain into a list

For testing or when you actually need the whole thing:

print(list(count_up(5)))  # [0, 1, 2, 3, 4]

Don't list() a generator that's expensive or infinite — that defeats the lazy point.
Generator expressions

Like list comprehensions but with ():
squares = (x * x for x in range(5))
print(list(squares))  # [0, 1, 4, 9, 16]

Generator expressions are great for one-line transformations of an iterable. def ... yield is for anything more involved.
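One place generator expressions shine: feeding a consumer like sum() directly, so no intermediate list ever exists. A small sketch:

```python
# Small case with a checkable answer:
print(sum(x * x for x in range(10)))  # 285

# Same shape at scale: still no million-item list in memory
total = sum(x * x for x in range(1_000_000))
```

Inside a single call, you can even drop the extra parentheses, as above.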
Generators are single-use

g = count_up(3)
list(g)  # [0, 1, 2]
list(g)  # [] — already exhausted

Calling the generator function gives a fresh generator each time:
list(count_up(3)) # [0, 1, 2]
list(count_up(3))  # [0, 1, 2] — fresh generator