Real datasets are messy. You have records like {"name": " Alex ", "email": "ALEX@X.com", "score": 85} and {"name": "Blair", "email": "blair@x.com"} (no score). You want every record to have a clean name, a lowercase email, and a score (defaulting to 0 when missing). How do you normalize the whole list in one pass?
A list comprehension that builds a new dict for each record, with the fields cleaned individually?
Exactly. Each cleaning operation is a single call — .strip() for whitespace, .lower() for case, .get(key, default) for the missing score. Combine them in a dict literal inside a list comprehension:
```python
[
    {
        "name": r["name"].strip(),
        "email": r["email"].lower(),
        "score": r.get("score", 0),
    }
    for r in records
]
```

One list comprehension, one new dict per input record, every field cleaned inline.
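As a quick sanity check, here is the comprehension applied to the two sample records from the opening question:

```python
records = [
    {"name": " Alex ", "email": "ALEX@X.com", "score": 85},
    {"name": "Blair", "email": "blair@x.com"},  # no score key
]

cleaned = [
    {
        "name": r["name"].strip(),    # trim surrounding whitespace
        "email": r["email"].lower(),  # normalize case
        "score": r.get("score", 0),   # default when the key is absent
    }
    for r in records
]

print(cleaned)
# [{'name': 'Alex', 'email': 'alex@x.com', 'score': 85},
#  {'name': 'Blair', 'email': 'blair@x.com', 'score': 0}]
```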
And this doesn't mutate the original records? Each iteration builds a fresh dict?
Fresh dict every time. The comprehension produces new objects, so the inputs are untouched. That's what makes this pattern safe for pipelines — no hidden state changes:
```python
def normalize_records(records: list) -> list:
    return [
        {"name": r["name"].strip(), "email": r["email"].lower(), "score": r.get("score", 0)}
        for r in records
    ]
```

If name is missing entirely, does r["name"].strip() crash?
With KeyError. For truly defensive code, use r.get("name", "").strip() — missing key treated as empty string. The test data here assumes names and emails are always present, but in production that .get() on every field is cheap insurance.
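A fully defensive variant is a one-line change per field. This is a sketch (the function name normalize_records_safe is illustrative, not from the lesson's test suite):

```python
def normalize_records_safe(records: list) -> list:
    """Like normalize_records, but tolerates missing name/email too."""
    return [
        {
            "name": r.get("name", "").strip(),    # missing name -> ""
            "email": r.get("email", "").lower(),  # missing email -> ""
            "score": r.get("score", 0),           # missing score -> 0
        }
        for r in records
    ]

print(normalize_records_safe([{"email": "X@Y.com"}]))
# [{'name': '', 'email': 'x@y.com', 'score': 0}]
```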
So this is comprehensions from Week 1 plus .get() from Day 25 plus string methods from general Python — composing everything into a cleaning pipeline.
That's the week-4 pattern. New dict per record, every field cleaned with a one-liner, composed inside a comprehension. The same shape scales to any normalization: trim, lowercase, default, convert type, validate.
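For instance, the same shape extended with a type conversion and a simple validation filter might look like this (a sketch; the title-casing and the "@" check are illustrative choices, not requirements from the exercise):

```python
def normalize(records: list) -> list:
    return [
        {
            "name": r.get("name", "").strip().title(),  # trim, then normalize case
            "email": r.get("email", "").lower(),        # lowercase
            "score": int(r.get("score", 0)),            # default, then convert "42" -> 42
        }
        for r in records
        if "@" in r.get("email", "")  # validate: drop records without an email
    ]

print(normalize([
    {"name": " dana ", "email": "D@X.com", "score": "42"},
    {"name": "x"},  # no email: filtered out
]))
# [{'name': 'Dana', 'email': 'd@x.com', 'score': 42}]
```

Each cleaning step is still one expression per field; validation moves into the comprehension's if clause.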
TL;DR: A list comprehension of dict literals — clean each field per record, build a new list.
Key tools:
- String methods: .strip(), .lower(), .upper(), .title()
- r.get("key", default) for safe lookup

| Mess | Fix |
|---|---|
| " Alex " | .strip() |
| "ALEX@X.COM" | .lower() |
| missing key | .get(key, default) |
| "123" (wanted int) | int(r.get("score", 0)) |
Layer cleaning steps inside the dict literal — one expression per field.
Write `normalize_records(records)` that returns a new list of dicts. For each record: strip whitespace from `name`, lowercase `email`, and default missing `score` to 0. Do not modify the inputs.