Three weeks of primitives. Today they all show up at once. Agent with 2 tools, eval suite with 3 cases, pass-rate threshold. The synthesis pattern.
```python
from pydantic_ai import Agent
import re

VALUES = {"x": 10, "y": 20, "z": 30}

# `model` is the model configured in earlier lessons,
# e.g. the string "openai:gpt-4o".
agent = Agent(model)

@agent.tool_plain
def add(a: int, b: int) -> int:
    """Return the sum of two integers."""
    return a + b

@agent.tool_plain
def lookup(key: str) -> int:
    """Look up the integer value associated with a single-letter key."""
    return VALUES[key]

cases = [
    ("What is x + y?", 30),
    ("What is x + 5?", 15),
    ("What is z + y?", 50),
]

def parse_int(s: str) -> int | None:
    """Return the last integer found in s, or None if there is none."""
    digits = re.findall(r"-?\d+", s)
    return int(digits[-1]) if digits else None

pass_count = 0
for prompt, expected in cases:
    out = agent.run_sync(prompt).output
    got = parse_int(out)
    ok = got == expected
    pass_count += int(ok)
    print(f"  {'PASS' if ok else 'FAIL'} — {prompt} → got={got} expected={expected}")

print(f"\n{pass_count} / {len(cases)} passed")
assert pass_count >= 2, f"expected at least 2/3, got {pass_count}"
```

Tool calling (add + lookup), agent loop (two tools, multi-step), output validation (regex-parse the int), eval suite (three cases), pass-rate threshold (≥ 2/3). Five primitives in about thirty lines.
Right. Each primitive sits where it fits. The agent decides which tool to call (lookup or add). The eval cases force composition — case 1 needs two lookups + add; case 2 needs one lookup + add; case 3 needs two lookups + add. The verification is a Python threshold over the suite.
What's the deterministic post-processing for?
The agent's result.output is text — "The answer is 30." or "30" or "x + y equals 30." — it varies by run. We pull out the digit runs and take the last one to get the integer. Same pattern as week 1 of AI Foundations: validate the shape of the response, not the exact text.
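A few quick checks make "validate the shape" concrete; these run as-is against the parse_int defined above:

```python
# Shape validation: any phrasing passes as long as the number is right.
assert parse_int("30") == 30
assert parse_int("The answer is 30.") == 30
assert parse_int("x + y equals 30.") == 30
assert parse_int("no number here") is None
```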
Five primitives composed:
| Primitive | From lesson | What it does here |
|---|---|---|
| Tool calling (single) | L4 | add registered with the agent |
| Multi-step tools | L5 | lookup then add for cases needing both |
| Multi-tool agent | L18 | Agent picks lookup vs add per task |
| Output validation | L11 | re.findall(r"-?\d+", out) parses the integer |
| Eval suite + threshold | L19 | 3 cases, pass-rate ≥ 2/3 |
No new concepts. The exercise is putting them together on a small generic problem.
("What is x + y?", 30) → lookup(x)=10, lookup(y)=20, add(10, 20)=30
("What is x + 5?", 15) → lookup(x)=10, add(10, 5)=15
("What is z + y?", 50) → lookup(z)=30, lookup(y)=20, add(30, 20)=50
Each case requires a different sequence of tool calls. Same agent, same toolset — the loop adapts per prompt. That's the agent's value: it composes tools without you writing the dispatch logic.
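If you want to see those sequences rather than take them on faith, pydantic-ai keeps the message history on the run result. A minimal sketch, assuming the agent defined above and a pydantic-ai version that exposes all_messages() and ToolCallPart:

```python
from pydantic_ai.messages import ToolCallPart

# Replay one case and print every tool call the agent made.
result = agent.run_sync("What is x + y?")
for message in result.all_messages():
    for part in message.parts:
        if isinstance(part, ToolCallPart):
            print(f"tool call: {part.tool_name}({part.args})")
print("final:", result.output)
```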
LLM sampling. Even with sharp prompts, occasional misses happen. 2/3 ≈ 67% — the gate passes when the agent works most of the time. Tightening it to 3/3 would make the lesson flake on sampling noise. Production thresholds depend on the cost and severity of a miss.
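The same gate expressed as a rate rather than a raw count scales with suite size. A sketch against the pass_count and cases above; THRESHOLD is a name introduced here, not part of the lesson code:

```python
# Rate-based variant of the final assert: survives a larger suite.
THRESHOLD = 2 / 3
rate = pass_count / len(cases)
assert rate >= THRESHOLD, f"pass rate {rate:.0%} below {THRESHOLD:.0%}"
```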
The agent might say "30" or "The answer is 30." or "x + y equals 30." — all should pass. Parsing the last integer in the output is robust to those wording variants: re.findall(r"-?\d+", s) collects the digit runs and parse_int takes the last one. It is a heuristic, though; a reply that ends on a different number would defeat it.
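One line demonstrates that limitation, runnable against parse_int above:

```python
# Edge case of last-integer parsing: the trailing number wins.
assert parse_int("It equals 30 in base 10.") == 10  # not 30
```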
Each case = 1 agent run = 2–4 LLM calls (one to plan, one or more to call tools, one to finalise). Three cases = roughly 6–12 calls against your quota. Substantial — but synthesis lessons run once a week.
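Spelled out (the 2–4 calls-per-run figure is the lesson's estimate, not a guarantee):

```python
# Budget estimate: calls per case times number of cases.
low, high = 2, 4   # LLM calls per agent run: plan, tool call(s), finalise
n_cases = 3
print(f"{n_cases * low}-{n_cases * high} LLM calls per eval run")  # 6-12
```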
This synthesis is deliberately compact. No moderation, no self-critique, no chained prompts — those are different patterns. AI Patterns' final-week synthesis (L28) brings in the real Composio integration and a wider composition.