A log line looks like "2026-04-14 10:23:45 ERROR Database connection failed". You want a dict with timestamp, level, and message. Why not just line.split(" ")?
Because "Database connection failed" has two spaces in it — a plain split would shred the message into three pieces. I'd lose the shape.
Right. split() takes an optional second argument — maxsplit — that caps how many splits it does. Use it to keep the tail intact:
parts = line.split(" ", 2)
# ["2026-04-14", "10:23:45", "ERROR Database connection failed"]
Two splits → three parts. Date, time, and then everything-else-as-one-string. Then I peel the level off the everything-else?
Exactly. Chain a second split on parts[2] with maxsplit=1:
parts = line.split(" ", 2)
timestamp = f"{parts[0]} {parts[1]}"
level, message = parts[2].split(" ", 1)
Two targeted splits beat one greedy one every time.
Why the f-string on the timestamp? Can't I leave it as two separate fields?
You could, but real monitoring systems want one sortable timestamp column, not two. Gluing the date and time back together gives a single ISO-style string that sorts chronologically as a string compare — no datetime parsing needed.
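That string-compare claim is easy to check. A minimal sketch (the sample timestamps are made up for illustration):

```python
# ISO-style "YYYY-MM-DD HH:MM:SS" strings sort chronologically under
# plain lexicographic comparison -- no datetime parsing required,
# because each field is fixed-width and ordered most-significant first.
stamps = [
    "2026-04-14 10:23:45",
    "2026-01-02 09:00:00",
    "2026-04-14 08:05:12",
]
print(sorted(stamps))
```

The earliest timestamp comes first purely by string order, which is exactly why monitoring systems like a single glued column.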
So the whole parser is: split twice, glue timestamp, return a dict. Four lines — no regex, no libraries.
That's the point of reaching for split() before regex. Simple tool first, always. Save the regex heavyweights for patterns that aren't at fixed positions in the line.
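Putting the pieces from the dialogue together, one way the four-line parser could look (a sketch, not the only valid shape; the dict keys are the ones named above):

```python
def parse_log_line(line):
    """Parse 'DATE TIME LEVEL MESSAGE...' into a dict -- no regex, no libraries."""
    parts = line.split(" ", 2)               # date, time, rest-of-line
    level, message = parts[2].split(" ", 1)  # peel the level off the rest
    return {
        "timestamp": f"{parts[0]} {parts[1]}",
        "level": level,
        "message": message,
    }

print(parse_log_line("2026-04-14 10:23:45 ERROR Database connection failed"))
```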
str.split(sep, maxsplit)
TL;DR: maxsplit caps the number of splits so the tail stays intact.
split(" ") — splits on every space (greedy)
split(" ", 2) — splits at most twice, gives 3 parts
split(" ", 1) — splits once, gives 2 parts

| Pass | Call | Result |
|---|---|---|
| 1 | line.split(" ", 2) | date, time, rest |
| 2 | parts[2].split(" ", 1) | level, message |
Unpack with level, message = ... when you know the shape — cleaner than rest[0] and rest[1].
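A quick illustration of that unpacking, using the rest-of-line string from the example above:

```python
rest = "ERROR Database connection failed"
# Tuple unpacking documents the expected two-part shape; unlike indexing
# into a list, it raises ValueError if the split yields a different count.
level, message = rest.split(" ", 1)
print(level)    # ERROR
print(message)  # Database connection failed
```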
Write `parse_log_line(line)` that takes a string like `"2026-04-14 10:23:45 ERROR Database connection failed"` and returns a dict with `timestamp`, `level`, and `message`. Use `split` with `maxsplit` — never a single greedy split.