You mentioned the billing report generator. The one with the string concatenation in a loop that you refactored last month. What did you think the speedup was when you changed it?
I thought maybe 10 to 20 percent. A minor cleanup at scale. I made the change because it looked cleaner, not because I had measured anything.
At 100 records: roughly 2x. At 1,000 records: roughly 10x. At 10,000 records: roughly 80x. The ratio grows with the input size. That is the signature of an O(n²) algorithm. The billing report processes about 8,000 records on a normal morning.
Eighty times slower at ten thousand records? That refactor was not a minor cleanup. I accidentally fixed a significant performance problem while trying to clean up the code.
And I want you to understand why, concretely. Strings in Python are immutable. Every += creates a new string object:
Iteration 1: copy 0 chars, write 50
Iteration 2: copy 50 chars, write 50
Iteration 3: copy 100 chars, write 50
...
Iteration n: copy (n-1)*50 chars, write 50

Total bytes copied: 0 + 50 + 100 + ... + (n−1)×50. An arithmetic series. It grows as n². str.join receives the sequence, sums the lengths, allocates once, fills once. O(n).
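The series can be made concrete with a small model. The function names and the fixed 50-character chunk are illustrative, not taken from the billing codebase:

```python
# Model of bytes copied by += in a loop, assuming every record formats
# to a fixed 50-character chunk.
CHUNK = 50

def bytes_copied_by_concat(n: int) -> int:
    """Each iteration copies the entire accumulated string before appending."""
    total = 0
    length = 0
    for _ in range(n):
        total += length      # copy everything built so far
        length += CHUNK      # then write the new chunk
    return total

def bytes_copied_by_join(n: int) -> int:
    """join pre-sums the lengths, allocates once, copies each chunk once."""
    return n * CHUNK

# The ratio between the two grows linearly with n: the O(n²) signature.
for n in (100, 1_000, 10_000):
    print(n, bytes_copied_by_concat(n) // bytes_copied_by_join(n))
```

The concat total is the arithmetic series 50 × n(n−1)/2, so dividing by the linear join cost gives a ratio of (n−1)/2: it keeps growing as the input grows, which is exactly what the 2x/10x/80x measurements show.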
I have seen "".join(format_record(r) for r in records) with a generator. Is that the same as building a list first?
Slightly slower, usually. str.join with a generator cannot pre-sum the lengths — it has to consume the generator into an internal list first to make two passes. The list comprehension inside join is typically the fastest form:
# Fastest: list comp inside join
report = "".join([format_record(r) for r in records])
# Slightly slower: generator (join materializes it anyway)
report = "".join(format_record(r) for r in records)
# Slowest at scale: += loop
report = ""
for r in records:
    report += format_record(r)

Generators are the leaner choice for memory, but not for join specifically.

That is the opposite of what I would have assumed. I would have written the generator expression thinking it was more efficient.
General principles give you starting points. Measurement gives you answers. "Generators are leaner" is true for memory. It is not universally true for every function. Now — f-strings versus .format(). You have both in the codebase. Does it matter?
I have assumed f-strings are faster. I switched a few format calls in the order processor last month to get a speedup. I think I measured 3 to 5 percent.
The patient had a broken leg. You polished the shoes. The 3 percent constant-factor difference between f-string and .format() only matters after you have fixed the O(n²) problem. Fix the algorithmic complexity first. Constant factors come after. I have seen engineers spend a day switching to f-strings in a function that had a += loop two hundred lines away that was the actual bottleneck.
I actually did that. I spent a morning changing formatting calls in the order processor because I had read f-strings were faster. The script was still slow. I had not found the loop yet.
Now you have the sequence. cProfile to find the bottleneck. timeit to confirm the fix works. O(n²) patterns fixed first, constant factors after. The billing report fix you made without measuring — run timeit on it today with 1,000 records and see the actual ratio. The number you produce is yours. Not from a blog post — from your machine, your function, your input.
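A minimal timeit harness for that measurement. The format_record here is a stand-in that emits a ~30-character line; substitute the real formatter from the billing report:

```python
import timeit

def format_record(r: int) -> str:
    # Stand-in for the real record formatter.
    return f"record {r:06d}: amount={r * 3:>10}\n"

def concat_report(records) -> str:
    report = ""
    for r in records:
        report += format_record(r)
    return report

def join_report(records) -> str:
    return "".join([format_record(r) for r in records])

records = range(1_000)
t_concat = timeit.timeit(lambda: concat_report(records), number=20)
t_join = timeit.timeit(lambda: join_report(records), number=20)
print(f"+= loop: {t_concat:.4f}s  join: {t_join:.4f}s  "
      f"ratio: {t_concat / t_join:.1f}x")
```

One caveat: CPython's refcount-1 optimization can partially mask the quadratic cost for a simple local variable, so the measured ratio on your machine may land below the theoretical one. That is exactly why you run it instead of quoting it.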
I want to see the actual number. If it is genuinely 10x at a thousand records, that changes how I write loops going forward. Not because I read it was better — because I measured it.
That is the Week 3 shift. Evidence, not intuition. Tomorrow is lru_cache — the standard memoization decorator. You mentioned the discount tier function that gets called hundreds of times per request. Bring it.
Why += on strings is O(n²). Python strings are immutable objects. The += operator on a string falls back to str.__add__ (strings define no in-place __iadd__); it must allocate a new string object of size len(current) + len(addition), copy current into it, copy addition after it, and release the old current for garbage collection. CPython has an optimization for the simple case where the string being accumulated has a reference count of 1: it can sometimes reallocate in place. But the optimization is fragile. It requires that nothing else references the partial result, so a second name, a container entry, or anything else holding the string defeats it. The safe assumption is that every += in a loop does a full copy.
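Immutability is easy to demonstrate directly: after +=, any other reference still sees the old object. A minimal sketch:

```python
s = "abc"
alias = s              # a second reference to the same string object
s += "def"             # allocates a new object and rebinds the name s
assert alias == "abc"  # the original object was never modified
assert s == "abcdef"
assert s is not alias  # s now points at a different object
```

This is also the case where the second reference (alias) defeats CPython's in-place reallocation: with a refcount above 1, the interpreter has no choice but to copy.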
Why str.join is O(n). The CPython implementation of str.join makes two passes. In the first pass, it iterates the sequence to compute the total length of the result. It then allocates a single string buffer of exactly that size. In the second pass, it copies each element into the buffer at the correct offset. Total work: two linear passes over the sequence plus one allocation. str.join accepts any iterable, but the two-pass algorithm needs to see the elements twice, so when you pass a generator, CPython first materializes it into a list and then runs both passes over that list.
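The shape of the algorithm can be sketched in pure Python, using a pre-sized bytearray to stand in for the single buffer allocation (CPython works on internal buffers directly; two_pass_join is an illustrative name, not an API):

```python
def two_pass_join(parts: list) -> str:
    """Sketch of join's strategy: sum lengths, allocate once, fill once."""
    encoded = [p.encode("utf-8") for p in parts]  # materialized sequence
    total = sum(len(p) for p in encoded)          # pass 1: sum the lengths
    buf = bytearray(total)                        # exactly one allocation
    offset = 0
    for p in encoded:                             # pass 2: copy into place
        buf[offset:offset + len(p)] = p
        offset += len(p)
    return buf.decode("utf-8")
```

Every byte is copied exactly once into a buffer that never has to grow, which is where the O(n) bound comes from.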
The list comprehension advantage for join. "".join([x for x in items]) is faster than "".join(x for x in items) for the reason above: the list is already materialized when join receives it, so join can make both passes directly. The generator version requires join to build an internal list first, then make the same two passes. The extra step is small but measurable in tight loops. The rule: when using join, pass a list or list comprehension, not a generator expression.
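A quick way to check that claim on your own interpreter. Timings vary by machine and CPython version, so the script prints the numbers rather than asserting a winner:

```python
import timeit

items = [str(i) for i in range(1_000)]

t_list = timeit.timeit(lambda: "".join([x for x in items]), number=5_000)
t_gen = timeit.timeit(lambda: "".join(x for x in items), number=5_000)

# Both produce identical output; only the path join takes differs.
print(f"list comp: {t_list:.3f}s  genexp: {t_gen:.3f}s")
```

If the gap on your machine is small, that is useful information too: it tells you this is a constant-factor choice, nowhere near the league of the += fix.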
f-string versus .format() at the CPython level. An f-string is compiled to dedicated formatting bytecode at compile time (FORMAT_VALUE and BUILD_STRING in CPython 3.6–3.11; the opcodes were renamed in 3.12), so the format spec is parsed once, when the module is compiled. A .format() call is a method call that must parse the format string at runtime on every invocation. The difference is roughly 10-30% per call, which matters only in code that formats millions of strings per second. For application code that formats a few thousand records in a batch job, the difference is invisible next to any O(n²) pattern in the same function.
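You can see the compile-time difference with the dis module; the exact opcode names depend on your CPython version:

```python
import dis

def with_fstring(name: str, n: int) -> str:
    return f"{name}: {n:>6}"

def with_format(name: str, n: int) -> str:
    return "{}: {:>6}".format(name, n)

# Same result either way; the difference is where the format spec is parsed.
assert with_fstring("total", 42) == with_format("total", 42)

dis.dis(with_fstring)  # inline formatting opcodes, no method call
dis.dis(with_format)   # a runtime call into str.format
```

The f-string disassembly shows the value being formatted directly by bytecode; the .format version shows an attribute lookup and a call, with the spec string handed over as data to be parsed at runtime.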