You can't optimize what you can't see. Cost observability is the simplest production discipline: log every LLM call with its prompt version, latency, and a size-based cost proxy, then aggregate.
```python
import time

from pydantic_ai import Agent  # assumes pydantic_ai is installed

USAGE_LOG = []  # one record per LLM call

def tracked_call(version, prompt):
    start = time.time()
    # `model` is whatever model identifier you configured elsewhere
    result = Agent(model).run_sync(prompt)
    elapsed = time.time() - start
    record = {
        "version": version,
        "prompt_chars": len(prompt),
        "answer_chars": len(result.output),
        "elapsed_s": round(elapsed, 2),
    }
    USAGE_LOG.append(record)
    return result.output
```

Every call appends a row. At the end of the run, dump the log to a Sheet for eyeballing, or aggregate in code (sum, mean) if you just want the totals.
Token count?
pydantic_ai's `result.usage()` exposes input/output token counts when the provider returns them. For a portable observability pattern, character count is a good-enough proxy that works on any model: roughly 4 characters per token for English text, rough but useful.
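Where the provider does report usage, you can record real token counts instead. A minimal sketch; the attribute names on the usage object vary across pydantic_ai versions, so the `getattr` fallbacks below are defensive assumptions, not a documented API:

```python
def token_counts(result):
    """Best-effort token counts from a pydantic_ai result; (None, None) if absent."""
    usage = result.usage()
    # Attribute names differ across pydantic_ai versions; probe defensively.
    inp = getattr(usage, "input_tokens", None) or getattr(usage, "request_tokens", None)
    out = getattr(usage, "output_tokens", None) or getattr(usage, "response_tokens", None)
    return inp, out
```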
And the Sheet?
One row per call. Columns: timestamp, version, prompt_chars, answer_chars, elapsed_s. After a batch of calls, the Sheet is your cost dashboard; no separate observability service needed. We use Tasks today (auto-provisioned) for the same effect.
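No Sheet handy? A plain CSV gives the same one-row-per-call dashboard. A sketch using only the standard library (the file name is arbitrary):

```python
import csv

def dump_log(path="usage_log.csv"):
    """Write USAGE_LOG to a CSV that opens in any spreadsheet."""
    if not USAGE_LOG:
        return
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(USAGE_LOG[0]))
        writer.writeheader()
        writer.writerows(USAGE_LOG)
```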
```python
import time

from pydantic_ai import Agent  # assumes pydantic_ai is installed

USAGE_LOG = []

def tracked_call(version, prompt):
    start = time.time()
    result = Agent(model).run_sync(prompt)  # `model` configured elsewhere
    elapsed = time.time() - start
    USAGE_LOG.append({
        "version": version,
        "prompt_chars": len(prompt),
        "answer_chars": len(result.output),
        "elapsed_s": round(elapsed, 2),
        "timestamp": time.time(),
    })
    return result.output

def summary():
    n = len(USAGE_LOG)
    if n == 0:
        return {"calls": 0}
    total_chars = sum(r["prompt_chars"] + r["answer_chars"] for r in USAGE_LOG)
    avg_latency = sum(r["elapsed_s"] for r in USAGE_LOG) / n
    return {
        "calls": n,
        "total_chars": total_chars,
        "avg_latency_s": round(avg_latency, 2),
    }
```

| Field | Why |
|---|---|
| version | Group by prompt version — A/B comparison |
| prompt_chars / answer_chars | Cost proxy (~ tokens) |
| elapsed_s | Latency — slow calls have user-visible impact |
| timestamp | Trend over time |
| user_id (if multi-user) | Per-user spend |
| tool_used (if applicable) | Which subroutines burn budget |
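To turn the character proxy into a rough dollar figure, divide by ~4 chars per token and multiply by your model's rates. The prices below are placeholders, not any provider's real pricing:

```python
CHARS_PER_TOKEN = 4        # rough average for English text
PRICE_IN_PER_MTOK = 3.0    # placeholder: USD per 1M input tokens
PRICE_OUT_PER_MTOK = 15.0  # placeholder: USD per 1M output tokens

def estimated_cost_usd(log):
    in_tokens = sum(r["prompt_chars"] for r in log) / CHARS_PER_TOKEN
    out_tokens = sum(r["answer_chars"] for r in log) / CHARS_PER_TOKEN
    return (in_tokens * PRICE_IN_PER_MTOK + out_tokens * PRICE_OUT_PER_MTOK) / 1_000_000
```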
```python
# Total cost (proxy)
total_chars = sum(r["prompt_chars"] + r["answer_chars"] for r in USAGE_LOG)

# Per-version
from collections import defaultdict

by_version = defaultdict(list)
for r in USAGE_LOG:
    by_version[r["version"]].append(r)

for v, rows in by_version.items():
    avg = sum(r["elapsed_s"] for r in rows) / len(rows)
    print(f"{v}: {len(rows)} calls, avg {avg:.2f}s")
```

Two rules keep this safe and cheap. Log `{prompt_chars: 412}`, not `{prompt: "customer says ..."}`: never put PII in your dashboard. And `print` is enough; don't build dashboards for one-off scripts.