A loop that mixes validation with side effects:

```python
for item in items:
    if not item.get("id"):  # validation
        raise ValueError("missing id")
    send_alert(item["id"])  # side effect
```

What happens with `[good, good, bad, good]`? Items 1 and 2 send. Item 3 raises. Item 4 never runs. Now you have two real-world artifacts and a half-broken state.
The fix: validate everything first, then act.
```python
items = [
    {"id": "a", "value": 5},
    {"id": "b"},    # missing 'value'
    {"value": 10},  # missing 'id'
    {"id": "c", "value": 15},
]

# Phase 1 — validate (no side effects)
errors = []
for i, item in enumerate(items):
    if "id" not in item:
        errors.append((i, "missing id"))
    if "value" not in item:
        errors.append((i, "missing value"))

if errors:
    for idx, msg in errors:
        print(f"item {idx}: {msg}")
    print(f"aborting: {len(errors)} errors")
else:
    # Phase 2 — act (only reached if zero errors)
    for item in items:
        process(item)
```

Four items in, both problems found before any send. Either everything's valid → process all; or any failure → abort with a summary.
All-or-nothing? What if I want to process the valid ones?
Both shapes are valid. Today's pattern is strict — if any input is bad, abort. Use it when items are conceptually one batch ("send today's standup to all 5 teams") and a partial run would confuse downstream.
Lenient alternative: filter to valid items, log which ones were dropped, process the rest. Use it when items are independent ("send each unread reply").
Pick by what the failure mode means downstream. Both are routine.
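A minimal sketch of the lenient shape, under the same assumptions as above (`valid` is a hypothetical stand-in for whatever per-item check you need; processing is represented by a `print`):

```python
def valid(item):
    # hypothetical validator: require both keys
    return "id" in item and "value" in item

items = [{"id": "a", "value": 5}, {"id": "b"}, {"value": 10}]

# Filter to valid items; keep the dropped ones for logging
good = [item for item in items if valid(item)]
bad = [(i, item) for i, item in enumerate(items) if not valid(item)]

for idx, item in bad:
    print(f"skipping item {idx}: {item}")  # log what was dropped

for item in good:
    print(f"processing {item['id']}")  # process the rest
```

Note the same pre-flight discipline applies: the split into `good` and `bad` happens before any side effect runs.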
Why split phases instead of one combined loop?
Side effects are committed. Once you've sent 2 emails, you can't unsend them. A pre-flight phase is your last chance to fail cleanly. Once phase 1 passes, phase 2's loop is structurally simpler — no validation noise — because every item has already been verified.
```python
# 1. Validate — collect all errors, mutate nothing external
errors = []
for i, item in enumerate(items):
    if not valid(item):
        errors.append((i, reason(item)))

# 2. Decide — abort if any errors, proceed if clean
if errors:
    for idx, msg in errors:
        log("error", idx=idx, msg=msg)
    raise ValueError(f"{len(errors)} validation errors")

# 3. Act — only reached on clean validation
for item in items:
    do_side_effect(item)
```

Three phases. Phase 1 reads, doesn't write. Phase 2 is one decision. Phase 3 is the side-effect loop, with no validation logic interleaved.
```python
# wrong — fails on first error
for item in items:
    assert valid(item), "invalid"
```

Fix one error, re-run, see the next error, fix, re-run — N round-trips for N errors.
```python
# right — collects all errors
errors = [(i, reason(item)) for i, item in enumerate(items) if not valid(item)]
```

One report, all problems, fix them all, one re-run. Much faster iteration.
Typical checks:

- `"id" in item`
- `isinstance(item["value"], int)`
- `0 < item["value"] < 1_000_000`
- `item["recipient"] in known_recipients`
- `item["name"].strip()`

A validator function per shape keeps the code clean:
```python
def validate_item(item):
    errors = []
    if "id" not in item:
        errors.append("missing id")
    if not isinstance(item.get("value"), (int, float)):
        errors.append("value must be numeric")
    return errors
```

Yesterday's CONFIG dict — validate it at script start:
```python
assert CONFIG["threshold"] > 0, "threshold must be positive"
assert CONFIG["recipient"], "recipient required"
assert isinstance(CONFIG["max_failures"], int)
```

A misconfigured CONFIG is just bad input. Catch it at line 5 of the script, not line 95.
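Putting the pieces together — a sketch that uses the per-item validator as phase 1, then decides and acts (the sample data and the `print`-based processing are illustrative stand-ins):

```python
def validate_item(item):
    errors = []
    if "id" not in item:
        errors.append("missing id")
    if not isinstance(item.get("value"), (int, float)):
        errors.append("value must be numeric")
    return errors

items = [{"id": "a", "value": 5}, {"id": "b"}, {"value": 10}]

# Phase 1 — validate everything, no side effects
all_errors = [(i, msg) for i, item in enumerate(items)
              for msg in validate_item(item)]

# Phase 2 — decide
if all_errors:
    for idx, msg in all_errors:
        print(f"item {idx}: {msg}")
    print(f"aborting: {len(all_errors)} errors")
else:
    # Phase 3 — act, every item already verified
    for item in items:
        print(f"processing {item['id']}")
```

With this sample data the run aborts with two errors and never reaches phase 3.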
| Strict | Lenient |
|---|---|
| Any error → abort | Skip bad items, log them, process the rest |
| Use when items are one logical batch | Use when items are independent |
| Easier to reason about | More forgiving in production |
You'll use both, depending on what the script does. Both follow the same pre-flight phase 1 — collect all errors first.