Structured Logging
Implement JSON logging with request tracing and correlation IDs.
I've been using print() for debugging. What should I use in production?
Structured logging. Instead of print("something broke"), you emit JSON objects with consistent fields — level, timestamp, message, and context. Tools like Datadog, CloudWatch, and Grafana can parse, search, and alert on structured logs.
```python
import json
from datetime import datetime, timezone

def log(level: str, message: str, **context):
    entry = {
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
        **context,
    }
    print(json.dumps(entry))

log("info", "Server started", port=8000)
# {"timestamp": "2024-01-15T10:30:00+00:00", "level": "info", "message": "Server started", "port": 8000}
```
Why JSON instead of plain text?
Because machines need to parse your logs. With plain text like "[ERROR] Task 42 failed", you need regex to extract the task ID. With JSON, it's just entry["task_id"].
Compare:
```
# Plain text — hard to parse
[2024-01-15 10:30:00] ERROR: Task 42 failed - timeout after 30s

# Structured — machine-readable
{"timestamp": "2024-01-15T10:30:00", "level": "error", "message": "Task failed", "task_id": 42, "reason": "timeout", "duration_s": 30}
```
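To make the difference concrete, here is a short sketch extracting the task ID both ways. The regex and field names follow the two example lines above; note how the text path needs a pattern that silently breaks if anyone rewords the message.

```python
import json
import re

plain = "[2024-01-15 10:30:00] ERROR: Task 42 failed - timeout after 30s"
structured = (
    '{"timestamp": "2024-01-15T10:30:00", "level": "error", '
    '"message": "Task failed", "task_id": 42, "reason": "timeout", "duration_s": 30}'
)

# Plain text: a fragile regex tied to the exact wording
match = re.search(r"Task (\d+) failed", plain)
task_id_from_text = int(match.group(1)) if match else None

# Structured: one parse, then direct field access
entry = json.loads(structured)
task_id_from_json = entry["task_id"]

print(task_id_from_text, task_id_from_json)  # 42 42
```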
What's a correlation ID?
A unique identifier that connects all logs for a single request. When a user complains about an error, you search by their correlation ID and see every log entry from that request:
```python
import uuid

from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_correlation_id(request: Request, call_next):
    # Short random ID tying together every log line for this request
    correlation_id = str(uuid.uuid4())[:8]
    log("info", "Request started",
        correlation_id=correlation_id,
        method=request.method,
        path=str(request.url.path))
    response = await call_next(request)
    log("info", "Request completed",
        correlation_id=correlation_id,
        status=response.status_code)
    return response
```
Every log entry for that request includes correlation_id: "a1b2c3d4". Filter by it to see the complete request lifecycle.
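The filtering step can be sketched in a few lines. This assumes logs were captured one JSON object per line; the sample entries and the `logs_for` helper are hypothetical, standing in for whatever your log search tool does.

```python
import json

# Hypothetical captured log stream: one JSON object per line,
# with two requests interleaved
log_lines = [
    '{"level": "info", "message": "Request started", "correlation_id": "a1b2c3d4", "path": "/orders"}',
    '{"level": "info", "message": "Request started", "correlation_id": "9f8e7d6c", "path": "/health"}',
    '{"level": "error", "message": "Database timeout", "correlation_id": "a1b2c3d4"}',
    '{"level": "info", "message": "Request completed", "correlation_id": "a1b2c3d4", "status": 500}',
]

def logs_for(correlation_id, lines):
    """Return every parsed entry belonging to one request."""
    return [e for e in map(json.loads, lines)
            if e.get("correlation_id") == correlation_id]

for entry in logs_for("a1b2c3d4", log_lines):
    print(entry["level"], "-", entry["message"])
# info - Request started
# error - Database timeout
# info - Request completed
```

The `/health` request's entry drops out, leaving only the lifecycle of the failing request.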
What log levels should I use?
Four levels cover most cases:
- debug — detailed info for development (log("debug", "Parsed 42 items", count=42))
- info — normal operations (log("info", "Request completed", status=200))
- warn — something unexpected but handled (log("warn", "Retry succeeded", attempts=3))
- error — something failed (log("error", "Database timeout", query="SELECT..."))
In production, set the minimum level to info. In development, set it to debug. This way your debug logs exist in code but don't clutter production output.
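One way to implement that threshold is to give each level a numeric rank and drop entries below a configured minimum. This is a minimal sketch extending the log() helper from earlier; the LEVELS mapping and MIN_LEVEL name are choices made here, not part of any library.

```python
import json
from datetime import datetime, timezone

# Numeric ranks so levels can be compared; mirrors the four levels above
LEVELS = {"debug": 10, "info": 20, "warn": 30, "error": 40}
MIN_LEVEL = "info"  # production setting; use "debug" in development

def log(level: str, message: str, **context):
    # Suppress anything below the configured threshold
    if LEVELS[level] < LEVELS[MIN_LEVEL]:
        return
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
        **context,
    }
    print(json.dumps(entry))

log("debug", "Parsed 42 items", count=42)     # suppressed at info level
log("info", "Request completed", status=200)  # emitted
```

Flipping MIN_LEVEL to "debug" re-enables the suppressed calls without touching any call sites.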