Yesterday you printed `item a failed: KeyError`. Readable. But three weeks from now, with scripts piling up logs everywhere, how do you grep just the failures? Just the items in step 2? Just the events with `status=ok`?
Structured logging. Each log line has a known shape — a label and key=value pairs. Easy to grep, easy to parse:
```python
import json

def log(event, **fields):
    print(json.dumps({"event": event, **fields}))

log("step", n=1, status="ok")
log("step", n=2, status="fail", error="KeyError")
log("summary", ok=1, fail=1)
```

Output:

```
{"event": "step", "n": 1, "status": "ok"}
{"event": "step", "n": 2, "status": "fail", "error": "KeyError"}
{"event": "summary", "ok": 1, "fail": 1}
```

Grep `"event": "step"` to see all step lines. Grep `"status": "fail"` to see all failures. Each line is valid JSON, so you can pipe the whole log to `jq` for filtering.
**Why JSON instead of `step n=1 status=ok`?**

JSON is parseable. `step n=1 status=ok` is almost parseable, but it has edge cases: values with spaces, embedded equals signs, type ambiguity (is `1` a number or a string?). JSON handles all of them, and every modern log analyzer reads JSON natively.
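A quick way to see the edge case, using a value I've made up for illustration:

```python
import json

# A subject line containing spaces and an equals sign.
subject = "Re: budget = final?"

# key=value style: a consumer can't tell where the value ends.
print(f"send subject={subject} status=ok")
# send subject=Re: budget = final? status=ok

# JSON keeps the value intact and quoted.
print(json.dumps({"event": "send", "subject": subject, "status": "ok"}))
# {"event": "send", "subject": "Re: budget = final?", "status": "ok"}
```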
**Is `**fields` Python magic?**

Just keyword-argument unpacking: extra keyword arguments in the call become dict items. `log("step", n=1, status="ok")` makes `fields = {"n": 1, "status": "ok"}`. Then `{"event": event, **fields}` merges them into a single dict for `json.dumps`. One small helper replaces dozens of `print(f"...")` calls.
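If the unpacking is new to you, here is the same mechanism in isolation, with nothing but the dicts involved:

```python
def capture(event, **fields):
    # **fields gathers every extra keyword argument into a dict.
    print("fields =", fields)
    # {**fields} splats that dict back into a new one.
    return {"event": event, **fields}

print(capture("step", n=1, status="ok"))
# fields = {'n': 1, 'status': 'ok'}
# {'event': 'step', 'n': 1, 'status': 'ok'}
```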
```python
import json

def log(event, **fields):
    print(json.dumps({"event": event, **fields}))
```

Four lines. Use it everywhere instead of `print`.
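Two common tweaks, shown as a sketch rather than as part of the helper above: write to stderr so logs don't mix with the script's real output, and stamp each line with a time. Both the `ts` field and the stderr choice are additions here, not something the four-line version requires:

```python
import json
import sys
import time

def log(event, **fields):
    # ts and stderr are optional extras, not part of the minimal helper.
    record = {"ts": time.time(), "event": event, **fields}
    print(json.dumps(record), file=sys.stderr)
```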
| Free-form | Structured |
|---|---|
| `print("sent email to bob")` | `log("send", to="bob", status="ok")` |
| Hard to grep — what counts as "sent"? | Easy: `event="send"` |
| Mixed shape across runs | Stable shape — every `send` has the same fields |
| Awkward to compare across days | Trivial: `jq 'select(.status=="fail")'` |
A standard small kit:
- `event` — the thing happening (`step`, `send`, `summary`, `error`)
- `status` — the outcome (`ok`, `fail`, `skipped`)
- `n` or `i` — the iteration number
- an identifier for the thing being processed (`item_id`, `recipient`, `event_id`)
- `error` — the exception class name, if applicable

Don't include things you wouldn't want grepped: secrets, large blobs of body text. Keep each line tight.
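Put together, one line built from this kit might look like the following (the recipient value is made up for illustration):

```python
log("send", n=4, recipient="bob@example.com", status="fail", error="SMTPException")
# {"event": "send", "n": 4, "recipient": "bob@example.com", "status": "fail", "error": "SMTPException"}
```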
Three natural log points per loop iteration:
```python
for i, item in enumerate(items):
    log("step", n=i, item=item, phase="start")
    try:
        do_work(item)
        log("step", n=i, item=item, phase="ok")
    except Exception as e:
        log("step", n=i, item=item, phase="fail", error=type(e).__name__)
```

Two lines per item: a `start`, then an `ok` or a `fail`. Compute whatever you need on top of these. Hung script? Find the last `phase="start"` with no matching `phase="ok"` — that's where it died.
Ten fields per log line is too many. Keep each line to the small kit above. More than that and you're embedding a row of a database in a log. Use a Sheet for that.
Replace `print(f"...")` calls one at a time:

```python
# before
print(f"sending to {recipient}: {subject}")

# after
log("send", to=recipient, subject=subject)
```

The migration costs a few characters per call site. The payoff is parseability for the rest of the script's life.