I pulled the diff from your last sprint. The parse_records function. Look at the signature:
```python
def parse_records(data, results=[]):
    for record in data:
        results.append(process(record))
    return results
```

A function that appends to a list. The default is empty if none is provided. I use that so callers can optionally pass in an existing list.
That default list is not empty when the second caller uses it. It is full of the first caller's records. Default arguments in Python are evaluated once, when the function is defined. Not when it is called. Every call that relies on the default is sharing the same object.
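The sharing is easy to see in a minimal sketch; `process` here is a stand-in identity function, not the real processing step from the codebase:

```python
def process(record):
    return record  # stand-in for the real processing step

def parse_records(data, results=[]):
    for record in data:
        results.append(process(record))
    return results

first = parse_records([1, 2])
second = parse_records([3])
print(second)           # [1, 2, 3] — the default list still holds the first call's records
print(first is second)  # True — both calls returned the same shared list
```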
I have been writing that pattern for two years. I have it in four functions in production. One of them is in the order pipeline. Nobody flagged it.
It is one of the most common bugs in Python and one of the hardest to catch in testing, because the first call always works. The bug only appears when the function is called more than once in the same process. The fix is one line:
```python
def parse_records(data, results=None):
    if results is None:
        results = []
    for record in data:
        results.append(process(record))
    return results
```

None as the sentinel. A new list constructed inside the body on each call. The rule: if the default is a mutable type — list, dict, set — use None and construct inside the body.
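The same rule covers dict and set defaults. A sketch with a hypothetical cache parameter (the function and its lookup are invented for illustration):

```python
def fetch_with_cache(key, cache=None):
    # None sentinel: a fresh dict is created per call unless the caller supplies one
    if cache is None:
        cache = {}
    if key not in cache:
        cache[key] = key.upper()  # stand-in for an expensive lookup
    return cache[key]

shared = {}
fetch_with_cache("a", shared)
fetch_with_cache("b", shared)
print(sorted(shared))  # ['a', 'b'] — a caller-supplied cache accumulates across calls
```

With `cache={}` as the default, every defaulted call would share one dict and the "cache" would silently grow for the life of the process.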
Three more patterns you mentioned. What are they?
Bare except Exception. Here is what I see in the API client:
```python
try:
    result = fetch_data(api_url)
except Exception:
    result = None
```

except Exception catches MemoryError, RecursionError, and every bug in your own code. If fetch_data raises a TypeError because you passed the wrong argument, that except will silently set result to None and you will spend an afternoon tracing why results are intermittently missing. Name the specific exceptions:
```python
try:
    result = fetch_data(api_url)
except (requests.Timeout, requests.ConnectionError, ValueError):
    result = None
```

When I do not know which exceptions a library raises, I cannot enumerate them. What do I catch then?
The library's base exception class. requests has requests.exceptions.RequestException. Catch that and you catch everything requests can raise, but not your own code's bugs. The line: catch what the library can do to you, not what the universe can do to Python.
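Sketched with a made-up library hierarchy — requests itself is structured the same way, with requests.exceptions.RequestException as the base:

```python
# Hypothetical library exception hierarchy, mirroring how requests organizes its own
class LibraryError(Exception):
    """Base class for everything the library can raise."""

class LibraryTimeout(LibraryError):
    pass

def fetch_data(url):
    raise LibraryTimeout(f"timed out fetching {url}")

try:
    result = fetch_data("https://example.com/api")
except LibraryError:
    # Catches any failure the library produces; a TypeError in our
    # own code is not a LibraryError and would still surface
    result = None

print(result)  # None
```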
I already searched my codebase. Eleven for i in range(len( patterns. Four of them never use i for anything except items[i].
Four direct refactors. range(len()) is C-style iteration imported into Python. It forces you to write items[i] on every access when for item in items gives you the value directly, and enumerate gives you both the index and the value when you need both. And the fifth: string concatenation in a loop.
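The three forms side by side, on a stand-in items list:

```python
items = ["alpha", "beta", "gamma"]

# C-style: the index is only ever used to fetch the value
for i in range(len(items)):
    print(items[i])

# Direct iteration: same output, no index bookkeeping
for item in items:
    print(item)

# enumerate: when the index is genuinely needed alongside the value
for i, item in enumerate(items):
    print(f"{i}: {item}")
```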
```python
# O(n²) — each += copies the entire accumulated string
report = ""
for record in records:
    report += format_record(record)

# O(n) — collect once, join once
report = "".join(format_record(r) for r in records)
```

I have a billing report generator that uses += on thousands of records. I thought list append would be slower than direct string append. That was completely backwards.
List append is O(1) amortized. The join does one allocation with known final size. The += loop does n allocations, each larger than the last. The difference is visible at a few hundred records and dramatic at thousands. We will measure it with timeit in Week 3. Save the billing script.
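A quick sketch of the measurement we will do properly in Week 3, using stand-in records rather than the billing data:

```python
import timeit

def concat(records):
    report = ""
    for r in records:
        report += r  # re-copies the accumulated string each pass
    return report

def join(records):
    return "".join(records)  # one allocation, each part copied once

records = ["line\n"] * 5000
# Absolute numbers vary by machine; the gap widens as the record count grows
print("concat:", timeit.timeit(lambda: concat(records), number=20))
print("join:  ", timeit.timeit(lambda: join(records), number=20))
```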
Five patterns to carry into Week 3: mutable defaults — use None sentinel. Bare except Exception — name the specific exceptions. String concatenation in loops — collect and join. range(len()) with no real index use — use direct iteration or enumerate. And deeply nested conditionals — guard clauses with early returns.
That is an accurate summary. Next week you stop guessing about performance. Your script takes 30 seconds to run, you have optimized the loops twice, and it has not gotten faster. We will measure it properly.
The five anti-patterns in this lesson share a property: they are syntactically valid Python that passes tests under normal conditions but fails in production under conditions that tests do not cover. Understanding why each one fails helps distinguish "I should fix this" from "I will fix this when it breaks."
Mutable defaults and the definition-time evaluation rule. In CPython, a function object stores its default argument values as a tuple in __defaults__. This tuple is created once, when the def statement executes, and is reused on every call. For immutable defaults (integers, strings, tuples, None), sharing does not matter — they cannot be modified. For mutable defaults (lists, dicts, sets), every call that modifies the default modifies the shared object. The None sentinel pattern works because None is a singleton — if results is None tests for that specific singleton, and results = [] inside the body creates a new list bound in the call's local frame, not in the function's __defaults__.
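The shared tuple is directly inspectable; a small probe (appending each input's length, just to mutate the default):

```python
def parse_records(data, results=[]):
    results.append(len(data))
    return results

print(parse_records.__defaults__)  # ([],) — created once, at def time
first = parse_records("ab")
second = parse_records("xyz")
print(parse_records.__defaults__)  # ([2, 3],) — the very same tuple, now mutated
print(first is second)             # True — both calls handed back the shared list
```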
Bare except and BaseException vs. Exception. Python's exception hierarchy puts everything under BaseException: Exception (all "expected" errors) sits alongside a handful of system exceptions (SystemExit, KeyboardInterrupt, GeneratorExit) that deliberately do not inherit from Exception. A bare except: catches BaseException and all its subclasses — including the system exceptions. except Exception: spares the system exceptions but still catches every programming error inside your own code (TypeError, AttributeError, NameError) and converts it to a silent None return or a swallowed error. The correct practice is to catch the most specific exception you can name, which makes every caught exception one you explicitly expected and designed around.
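The hierarchy can be verified with issubclass, and the swallowed-bug failure mode reproduced with a toy lookup function (invented for illustration):

```python
# System exceptions sit beside Exception, not under it
print(issubclass(SystemExit, BaseException))     # True
print(issubclass(SystemExit, Exception))         # False
print(issubclass(KeyboardInterrupt, Exception))  # False

# Programming errors are under Exception, so `except Exception` swallows them
print(issubclass(TypeError, Exception))          # True

def lookup(mapping, key):
    try:
        return mapping[key.lower()]  # AttributeError if key is not a string
    except Exception:
        return None                  # the bug is silently converted to None

print(lookup({"a": 1}, 42))  # None — the AttributeError never surfaces
```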
String concatenation and O(n²) allocation. Python strings are immutable. Every += on a string creates a new string object in memory: it allocates a block of size len(current) + len(addition), copies current into it, copies addition after it, and discards the old current. For n items, the total bytes copied is proportional to n². str.join() pre-computes the final length, allocates once, and copies each part once: O(n). CPython has an optimization that makes += faster in simple cases where the string has exactly one reference, but this optimization does not apply reliably in loops where the variable might be referenced elsewhere.
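The n² growth can be checked with simple arithmetic, assuming every record formats to k bytes (the byte counts model copying only, ignoring allocator overhead):

```python
def bytes_copied_concat(n, k):
    # Iteration i re-copies the i*k bytes accumulated so far, plus k new bytes
    return sum(i * k + k for i in range(n))

def bytes_copied_join(n, k):
    # join allocates the final buffer once and copies each part exactly once
    return n * k

print(bytes_copied_concat(1000, 80))  # 40040000 — ~40 MB shuffled for ~80 KB of output
print(bytes_copied_join(1000, 80))    # 80000
```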
Guard clauses and McCabe complexity. The nested conditional anti-pattern increases McCabe cyclomatic complexity — the number of independent paths through a function. High complexity correlates with higher defect rates and lower maintainability. Guard clauses (early returns on failure conditions) reduce complexity without reducing capability: every code path is still handled, but the happy path is linear and the error paths exit immediately. Most static analysis tools flag functions with cyclomatic complexity above 10; deeply nested conditionals are a direct cause of high complexity scores.
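The lesson does not show this pattern's code, so here is an illustrative sketch with made-up validation rules, nested first and then flattened with guard clauses:

```python
# Nested: every check adds a level; the happy path is buried three deep
def ship_order_nested(order):
    if order is not None:
        if order.get("items"):
            if order.get("address"):
                return f"shipping {len(order['items'])} items"
            else:
                return "missing address"
        else:
            return "empty order"
    else:
        return "no order"

# Guard clauses: failures exit immediately; the happy path reads top to bottom
def ship_order(order):
    if order is None:
        return "no order"
    if not order.get("items"):
        return "empty order"
    if not order.get("address"):
        return "missing address"
    return f"shipping {len(order['items'])} items"
```

Both versions handle exactly the same cases; only the shape changes, which is why the refactor lowers the complexity score without losing behavior.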