You have ten thousand parsed logs in a flat list. You want to know what happened hour by hour. What shape does the output take?
A dict where the key is the hour and the value is the list of logs for that hour. Something like {"2026-04-14 10": [...], "2026-04-14 11": [...]}.
Exactly. Extracting the hour is pure string slicing: our timestamps are in ISO-style "YYYY-MM-DD HH:MM:SS" format, so the first 13 characters are date plus hour:
hour = log["timestamp"][:13]
# "2026-04-14 10:23:45" → "2026-04-14 10"
That's the bucket label.
And then append the log to whatever's already there? I'd need to handle the case where the bucket doesn't exist yet.
Dicts have a built-in for that: dict.setdefault(key, default). If the key exists, it returns the current value. If not, it sets the key to default and returns that. Either way you get something append-friendly:
groups = {}
for log in logs:
    hour = log["timestamp"][:13]
    groups.setdefault(hour, []).append(log)
One line per log. No `if hour not in groups` dance.
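Why `setdefault` over the `if hour not in groups` form? Is it just shorter?
Shorter and more intent-revealing. `setdefault` says "get-or-create the list." The `if hour not in groups` form buries that intent under three lines of logic. It also does one dict lookup per log where the explicit form does at least two. The catch is that it builds a throwaway empty list on every call, so for a loop running over a million entries, time both forms before counting on a speedup.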
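To see that the two forms really build the same dict, here is a minimal sketch on three made-up logs. The sample data and the `level` field are invented for illustration; only the `timestamp` key comes from the lesson.

```python
# Hypothetical sample data: only the "timestamp" key is given by the lesson.
logs = [
    {"timestamp": "2026-04-14 10:23:45", "level": "INFO"},
    {"timestamp": "2026-04-14 10:59:01", "level": "ERROR"},
    {"timestamp": "2026-04-14 11:00:12", "level": "INFO"},
]

# Explicit form: check for the bucket, create it if missing, then append.
explicit = {}
for log in logs:
    hour = log["timestamp"][:13]
    if hour not in explicit:
        explicit[hour] = []
    explicit[hour].append(log)

# setdefault form: get-or-create the bucket and append in one expression.
compact = {}
for log in logs:
    hour = log["timestamp"][:13]
    compact.setdefault(hour, []).append(log)

print(explicit == compact)  # True
print({hour: len(items) for hour, items in compact.items()})
# {'2026-04-14 10': 2, '2026-04-14 11': 1}
```

Both loops end with the same buckets in the same order, so the choice between them comes down to readability and, if it matters, a measurement on your own data.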
And this same pattern works for grouping by any field — level, IP, user. Just change the key expression.
Grouping is grouping. Key expression changes, loop body stays put. The shape is reusable for every per-X analysis you'll ever write.
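One way to make "change the key expression" concrete is to pass the key in as a function. This is only a sketch: `group_by` is a hypothetical helper name, and the `level` and `ip` fields are assumed for the example rather than taken from the lesson's data.

```python
def group_by(items, key):
    """Group items into a dict of lists, keyed by key(item)."""
    groups = {}
    for item in items:
        # Same loop body as before; only the key expression varies.
        groups.setdefault(key(item), []).append(item)
    return groups


# Hypothetical sample logs; "level" and "ip" are assumed field names.
logs = [
    {"timestamp": "2026-04-14 10:23:45", "level": "INFO", "ip": "10.0.0.1"},
    {"timestamp": "2026-04-14 10:59:01", "level": "ERROR", "ip": "10.0.0.2"},
    {"timestamp": "2026-04-14 11:00:12", "level": "INFO", "ip": "10.0.0.1"},
]

by_level = group_by(logs, lambda log: log["level"])
by_ip = group_by(logs, lambda log: log["ip"])
by_day = group_by(logs, lambda log: log["timestamp"][:10])

print(sorted(by_level))        # ['ERROR', 'INFO']
print(len(by_ip["10.0.0.1"]))  # 2
print(list(by_day))            # ['2026-04-14']
```

The loop body never changes between the three calls, which is the whole point of the pattern.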
setdefault
TL;DR: `groups.setdefault(key, []).append(item)` is the one-line pattern for building a list-valued dict.
- `setdefault(k, d)`: returns existing value or inserts default
- `timestamp[:13]`: slice to "YYYY-MM-DD HH" (date + hour)

| Slice | Result |
|---|---|
| `[:10]` | "2026-04-14" (day) |
| `[:13]` | "2026-04-14 10" (hour) |
| `[:16]` | "2026-04-14 10:23" (minute) |
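The slice widths can be checked directly on the example timestamp. A quick sketch, assuming every timestamp is zero-padded to the fixed "YYYY-MM-DD HH:MM:SS" layout (the slices would not line up otherwise):

```python
timestamp = "2026-04-14 10:23:45"

# Each slice keeps a longer prefix, so each bucket is finer than the last.
day = timestamp[:10]
hour = timestamp[:13]
minute = timestamp[:16]

print(day)     # 2026-04-14
print(hour)    # 2026-04-14 10
print(minute)  # 2026-04-14 10:23
```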
Write `group_by_hour(logs)` that takes a list of log dicts (each with a `timestamp` key in `"YYYY-MM-DD HH:MM:SS"` format) and returns a dict keyed by the first 13 characters of each timestamp (e.g. `"2026-04-14 10"`), where each value is the list of full log dicts for that hour.