Yesterday you used datetime.fromisoformat() for clean ISO timestamps. But the ops team's legacy log files have timestamps like 07/Apr/2026:09:14:33 +0000 — Apache combined log format. fromisoformat() won't handle that. What now?
I need to tell Python exactly what format to expect. That's strptime — but I've been avoiding it because every time I look up the format codes I spend twenty minutes figuring out which % is which.
The names are genuinely bad. My mnemonic: f for format, p for parse. strftime formats a datetime into a string. strptime parses a string into a datetime. The f and p are the only difference in the name, and that's intentional — they're inverses of each other.
strftime and strptime — named by someone who thought brevity was more important than clarity. I'll commit to the mnemonic: f for format, p for parse.
The format codes you need for log timestamps:
from datetime import datetime
# Apache combined log format: 07/Apr/2026:09:14:33 +0000
apache_ts = "07/Apr/2026:09:14:33 +0000"
dt = datetime.strptime(apache_ts, "%d/%b/%Y:%H:%M:%S %z")  # timezone-aware, offset +0000
# Syslog format: Apr 7 09:14:33 (the string has no year, so dt2.year defaults to 1900)
syslog_ts = "Apr 7 09:14:33"
dt2 = datetime.strptime(syslog_ts, "%b %d %H:%M:%S")
# ISO 8601: 2026-04-07T09:14:33Z (the Z is matched as a literal, so dt3 is naive)
iso_ts = "2026-04-07T09:14:33Z"
dt3 = datetime.strptime(iso_ts, "%Y-%m-%dT%H:%M:%SZ")
%d is day, %b is abbreviated month name, %Y is four-digit year, %H is 24-hour hour, %M is minutes, %S is seconds. The %z at the end handles the timezone offset? And the format string is a template where I write the literal separators, such as : and /, exactly as they appear?
Exactly. The format string is a picture of what the timestamp looks like, with % codes standing in for the variable parts. The literal separators — /, :, space, T, Z — appear verbatim in the format. %z handles timezone offsets like +0000 and -0500.
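To see what %z actually does with the offset, here is a minimal sketch built on the Apache sample above. The second timestamp, with a -0500 offset, is made up for illustration.

```python
from datetime import datetime, timedelta

APACHE_FMT = "%d/%b/%Y:%H:%M:%S %z"

dt = datetime.strptime("07/Apr/2026:09:14:33 +0000", APACHE_FMT)

# %z consumed "+0000" and attached it as tzinfo, so dt is timezone-aware.
print(dt.utcoffset())  # 0:00:00

# A non-zero offset is parsed the same way (hypothetical sample line).
dt_east = datetime.strptime("07/Apr/2026:04:14:33 -0500", APACHE_FMT)
print(dt_east.utcoffset() == timedelta(hours=-5))  # True

# Both strings describe the same instant, so the aware datetimes compare equal.
print(dt == dt_east)  # True
```

The offset becomes part of the datetime object, which is why it never appears as a literal in the parse format.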
And strftime goes the other direction — if I have a datetime object and I want to write it in a specific format for a report:
Correct:
from datetime import datetime
dt = datetime(2026, 4, 7, 9, 14, 33)
print(dt.strftime("%Y-%m-%d")) # 2026-04-07
print(dt.strftime("%d/%b/%Y:%H:%M:%S +0000")) # 07/Apr/2026:09:14:33 +0000
print(dt.strftime("%A, %B %d")) # Tuesday, April 07
print(dt.strftime("%Y%m%d_%H%M%S")) # 20260407_091433
%A is full weekday name, %B is full month name. So I can take any timestamp in the log, parse it to a datetime with strptime, and then format it however the ops team's report needs with strftime. Any format in, any format out.
Think of the pair as a translator. strptime reads the foreign date format. strftime writes it in the format you need. The datetime object in the middle is format-neutral: it's just a moment in time. The format is a concern of the boundary, not the logic.
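One way to picture "any format in, any format out" is a tiny helper that does both steps. This is only a sketch: reformat_timestamp is a hypothetical name, not something from the standard library.

```python
from datetime import datetime

def reformat_timestamp(ts: str, in_fmt: str, out_fmt: str) -> str:
    """Parse ts with in_fmt, then render the same moment with out_fmt."""
    moment = datetime.strptime(ts, in_fmt)  # boundary in: string -> datetime
    return moment.strftime(out_fmt)         # boundary out: datetime -> string

APACHE_FMT = "%d/%b/%Y:%H:%M:%S %z"
sample = "07/Apr/2026:09:14:33 +0000"

iso_like = reformat_timestamp(sample, APACHE_FMT, "%Y-%m-%dT%H:%M:%S%z")
compact = reformat_timestamp(sample, APACHE_FMT, "%Y%m%d_%H%M%S")

print(iso_like)  # 2026-04-07T09:14:33+0000
print(compact)   # 20260407_091433
```

Everything between the two calls works on the datetime object, so the input and output formats never need to know about each other.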
Today's problem: normalize a list of log entries where timestamps come in three different formats — Apache, syslog, and ISO 8601. Parse each one to a datetime with the right strptime format, compute some time-window statistics, and emit normalized ISO timestamps with strftime.
The tricky part is detecting which format applies to which string. The clean approach is to keep a list of candidate format strings and try each one in turn:
from datetime import datetime
# Candidate formats, tried in order
FORMATS = [
    "%Y-%m-%dT%H:%M:%SZ",    # ISO 8601
    "%d/%b/%Y:%H:%M:%S %z",  # Apache
    "%b %d %H:%M:%S",        # syslog
]
def parse_any_timestamp(ts: str) -> datetime | None:
    """Return the first successful parse, or None if no format matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(ts.strip(), fmt)
        except ValueError:
            continue  # wrong format, try the next one
    return None
Try each format in a loop, return the first one that works. The same try/except surgical catching from json.loads(), applied to timestamp parsing. The format list is the configuration; the loop is the mechanism.
Exactly the same pattern, new context. Tomorrow: math and statistics — after two weeks of text processing you get to use Python's built-in numerical toolkit. The ops team has response-time data and they want mean, median, standard deviation, and percentiles. All of it is in the standard library.
strftime and strptime are the two functions that make datetime objects practically useful for log processing. Without them, you can create datetime objects from known components (datetime(2026, 4, 7, 9, 14, 33)) or from ISO 8601 strings (datetime.fromisoformat()). With them, you can parse any timestamp format you encounter and produce any timestamp format you need.
The codes you need for log analysis: %Y (4-digit year), %y (2-digit year), %m (zero-padded month number), %b (abbreviated month name: Jan, Feb...), %B (full month name), %d (zero-padded day of month), %H (24-hour hour), %I (12-hour hour), %M (minutes), %S (seconds), %f (microseconds), %z (UTC offset: +HHMM or -HHMM), %Z (timezone name: UTC, EST), %a (abbreviated weekday: Mon, Tue...), %A (full weekday).
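A few of the less common codes from that list in action. The input string is invented for illustration; it is not a format the ops team's logs are known to use.

```python
from datetime import datetime

# %a weekday, %d/%m/%y numeric date with 2-digit year, %f microseconds
dt = datetime.strptime(
    "Tue 07/04/26 09:14:33.250000",
    "%a %d/%m/%y %H:%M:%S.%f",
)
print(dt)  # 2026-04-07 09:14:33.250000

# The same moment written back out with the long-form codes
pretty = dt.strftime("%A %d %B %Y, %H:%M:%S")
print(pretty)  # Tuesday 07 April 2026, 09:14:33
```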
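The format string is a template: literal characters appear verbatim, and % codes stand in for variable parts. "%Y-%m-%dT%H:%M:%SZ" matches "2026-04-07T09:14:33Z" exactly, with the T and Z as literals. Because the Z is matched as text rather than parsed by %z, the resulting datetime is naive (it carries no tzinfo).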
datetime.strptime() raises ValueError if the string does not match the format. It does not return None. For multi-format parsing — trying a list of known formats — wrap each call in try/except ValueError and return None (or a sentinel) if all formats fail. This is the standard pattern for normalizing timestamp fields that come from heterogeneous sources.
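%b and %B (abbreviated and full month names) are locale-sensitive: they follow the process's LC_TIME setting. Python starts in the "C" locale, so English names are used unless the program calls locale.setlocale(); once a French locale is active, "Apr" no longer parses and a French abbreviation such as "avr." is expected instead. For log processing where portability matters, prefer numeric month formats (%m) whenever you control the format and the parser may run inside applications with varied locale settings. When reading existing log files generated on English systems from a process that leaves the locale alone, %b is reliable.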
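If you have to read English month abbreviations but cannot vouch for the locale of the process, one option is to translate the month yourself and let strptime handle only numeric fields. A minimal sketch, with parse_apache_portable and MONTHS as hypothetical names:

```python
from datetime import datetime

# English abbreviations as they appear in Apache logs, independent of LC_TIME.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}

def parse_apache_portable(ts: str) -> datetime:
    """Parse '07/Apr/2026:09:14:33 +0000' without relying on %b.

    Raises KeyError for an unknown month name and ValueError for any
    other mismatch.
    """
    day, month_name, rest = ts.strip().split("/", 2)
    numeric = f"{day}/{MONTHS[month_name]:02d}/{rest}"
    return datetime.strptime(numeric, "%d/%m/%Y:%H:%M:%S %z")

result = parse_apache_portable("07/Apr/2026:09:14:33 +0000")
print(result.isoformat())  # 2026-04-07T09:14:33+00:00
```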
datetime.isoformat() produces a guaranteed ISO 8601 string without requiring a format argument. For output to other systems, APIs, or storage, isoformat() is preferable to strftime("%Y-%m-%dT%H:%M:%S") — it's shorter and unambiguous. strftime is for output to humans or to legacy systems with specific format requirements.
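A short illustration of what isoformat() emits for different kinds of datetime objects, using the same moment as the earlier examples:

```python
from datetime import datetime, timezone

naive = datetime(2026, 4, 7, 9, 14, 33)
aware = naive.replace(tzinfo=timezone.utc)
precise = aware.replace(microsecond=250000)

print(naive.isoformat())    # 2026-04-07T09:14:33
print(aware.isoformat())    # 2026-04-07T09:14:33+00:00
print(precise.isoformat())  # 2026-04-07T09:14:33.250000+00:00

# timespec trims the output to a fixed precision
print(precise.isoformat(timespec="seconds"))  # 2026-04-07T09:14:33+00:00

# The output round-trips through fromisoformat()
round_trip = datetime.fromisoformat(aware.isoformat())
print(round_trip == aware)  # True
```

The offset and microseconds appear only when the object carries them, so there is no format string to keep in sync with the data.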
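datetime.strptime() does a fair amount of work on every call: it checks the locale, looks up the regular expression built for the format (CPython caches the compiled pattern for recently used formats rather than recompiling it each time), and runs a general-purpose match. For parsing millions of timestamps, datetime.strptime() in a loop is typically measurably slower than a precompiled regular expression followed by manual component extraction. The performance difference only matters at high volume, so profile before optimizing.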
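If you want to measure the gap on your own machine, a rough harness like the one below is enough. It is a sketch for the ISO format only; the function names are hypothetical and the timings will vary by Python version and hardware, so no numbers are shown.

```python
import re
import timeit
from datetime import datetime

ISO_FMT = "%Y-%m-%dT%H:%M:%SZ"
ISO_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})Z")

def parse_with_strptime(ts: str) -> datetime:
    return datetime.strptime(ts, ISO_FMT)

def parse_with_regex(ts: str) -> datetime:
    """Precompiled regex plus manual extraction of the six components."""
    match = ISO_RE.fullmatch(ts)
    if match is None:
        raise ValueError(f"unrecognized timestamp: {ts!r}")
    return datetime(*map(int, match.groups()))

sample = "2026-04-07T09:14:33Z"

# Both approaches must agree before the timing means anything.
assert parse_with_strptime(sample) == parse_with_regex(sample)

for func in (parse_with_strptime, parse_with_regex):
    seconds = timeit.timeit(lambda: func(sample), number=20_000)
    print(f"{func.__name__}: {seconds:.3f}s for 20,000 calls")
```

The manual version is also stricter about what it accepts, which may or may not be what you want for messy logs.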