How much does zuzu.codes cost?

The starter track is free — read all lessons and practice for free. Full access to every track (current and future) is $14.99/month. Cancel anytime.

How long does each track take?

Each track is designed as a 30-day challenge — one lesson per day, about 15 minutes each. Go at your own pace, but the structure is built around daily consistency.

What's the lesson format?

Each lesson is a student-teacher dialogue with code examples, followed by a hands-on code challenge in an in-browser editor. You read, you understand, then you write real code.

Do I need prior coding experience?

Our beginner track starts from absolute zero — no prior experience needed. Advanced tracks build on earlier ones, and the platform tells you exactly where to start.

How is zuzu.codes different from freeCodeCamp or Codecademy?

zuzu.codes uses a structured 30-day track format with dialogue-based teaching, an in-browser code editor, and gamification (XP, streaks, progress tracking). The format builds genuine understanding through daily practice.

Multi-step recovery — Ai Mastery

Day 20 · ~11 min

A multi-step agent picks a tool, calls it, reads the result, picks the next. Recovery is what happens when a tool fails: the agent picks a different tool and continues.

python

def agent_with_recovery(goal, tools):
    used = []
    last_error = None
    for tool in tools:
        try:
            result = tool(goal)
            used.append(tool.__name__)
            return {"used": used, "result": result}
        except Exception as e:
            used.append(f"{tool.__name__}-FAILED")
            last_error = e
            continue
    raise RuntimeError(f"all tools failed; last error: {last_error}")

This is just fallback chains again, isn't it?

The structure is similar; the responsibility is different. Fallback chain → "answer a query". Multi-step → "complete a goal that may need multiple successful tool calls in sequence". When a step in the sequence fails, the agent doesn't restart — it picks a different tool for that step and continues.
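To make the difference concrete, here is a minimal sketch of a two-step goal in which each step has its own ordered list of tool options. The names `run_goal`, `fetch_primary`, `fetch_cache`, and `summarize` are hypothetical and exist only for this illustration; the lesson's own helpers may be shaped differently.

```python
def run_goal(goal, steps):
    """Run steps in order; each step is (name, [tool options]).

    The output of one step feeds the next. If a tool fails, only that
    step is re-attempted with its next option; earlier steps are kept.
    """
    value = goal
    trail = []
    for name, options in steps:
        for tool in options:
            try:
                value = tool(value)
                trail.append(f"{name}:{tool.__name__}")
                break  # this step is done, move on to the next step
            except Exception:
                trail.append(f"{name}:{tool.__name__}-FAILED")
        else:
            # No option for this step succeeded.
            raise RuntimeError(f"step {name!r} failed; trail: {trail}")
    return {"result": value, "trail": trail}


# Hypothetical tools for illustration only.
def fetch_primary(query):
    raise TimeoutError("primary source unavailable")

def fetch_cache(query):
    return f"cached page about {query}"

def summarize(text):
    return text.upper()

outcome = run_goal("pandas", [
    ("fetch", [fetch_primary, fetch_cache]),
    ("summarize", [summarize]),
])
print(outcome["trail"])
```

The fetch step fails once and recovers with the cache; the summarize step then runs on the recovered value without the goal being restarted.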

And how does the agent know which tool to pick next?

Two strategies. Static — a hardcoded ordered list of fallback tools (today's lesson). Dynamic — the LLM picks a tool based on the failure (production agent loops). For this curriculum we stay with static; agent loops are a separate track.

Multi-step recovery

python

class StepResult:
    def __init__(self, status, value=None, error=None, used_tool=None):
        self.status = status      # 'ok' | 'failed'
        self.value = value
        self.error = error
        self.used_tool = used_tool

def try_step(goal, tool_options):
    """Try each tool until one succeeds. Return StepResult with the tool that worked."""
    last_error = None
    for tool in tool_options:
        try:
            value = tool(goal)
            return StepResult("ok", value, used_tool=tool.__name__)
        except Exception as e:
            last_error = e  # keep the most recent failure for the caller
    return StepResult("failed", error=f"all tools exhausted; last error: {last_error}")

Static vs dynamic recovery

Style	How tool is picked	Fits
Static	Hardcoded ordered list — try A, then B	Known failure modes; deterministic
Dynamic	LLM sees failure, picks next tool	Unknown failure space; production agents

Dynamic is more powerful but adds an LLM call per recovery — costlier, harder to test. Static is simpler and covers most production cases. Today's lesson is static.
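The two styles can share one loop and differ only in how the next tool is chosen. The sketch below assumes a hypothetical `choose_next(remaining, error)` callback; `dynamic_choice` is a rule-based stand-in for an LLM call, not a real model integration, and the tool names are made up.

```python
def recover(goal, tools, choose_next):
    """Try tools until one succeeds; choose_next decides what to try after a failure."""
    if not tools:
        raise ValueError("at least one tool is required")
    remaining = list(tools)
    tool = remaining.pop(0)
    used = []
    while True:
        try:
            result = tool(goal)
            used.append(tool.__name__)
            return {"used": used, "result": result}
        except Exception as e:
            used.append(f"{tool.__name__}-FAILED")
            if not remaining:
                raise RuntimeError(f"all tools failed; last error: {e}")
            tool = choose_next(remaining, e)
            remaining.remove(tool)

def static_choice(remaining, error):
    # Static: ignore the error and take the next tool in the hardcoded order.
    return remaining[0]

def dynamic_choice(remaining, error):
    # Stand-in for an LLM: a real agent would send the error text to a model.
    if isinstance(error, TimeoutError):
        for tool in remaining:
            if tool.__name__ == "local_lookup":
                return tool
    return remaining[0]

# Hypothetical tools for illustration only.
def web_search(goal):
    raise TimeoutError("network timeout")

def api_lookup(goal):
    return f"api:{goal}"

def local_lookup(goal):
    return f"local:{goal}"

tools = [web_search, api_lookup, local_lookup]
static_run = recover("q", tools, static_choice)
dynamic_run = recover("q", tools, dynamic_choice)
```

With the same tools and the same failure, the static run falls through to `api_lookup`, while the dynamic stand-in reads the timeout and jumps to `local_lookup`. The static path is easy to test because its order never changes.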

What recovery is NOT

Retry — same tool, same args, hope for transient resolution
Fallback — different strategy for the same query (covered yesterday)
Recovery — different tool for the same step in a multi-step goal

The distinction blurs at the edges. The pattern is the same: when the primary path fails, route around the failure.
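For contrast with recovery, a retry keeps the same tool and the same argument. The wrapper below is a minimal sketch; `with_retry` and `flaky` are hypothetical names, and it retries immediately, whereas real code would usually wait between attempts.

```python
def with_retry(tool, attempts=3):
    """Retry: call the same tool with the same argument up to `attempts` times."""
    def wrapped(arg):
        last_error = None
        for _ in range(attempts):
            try:
                return tool(arg)
            except Exception as e:
                last_error = e
        raise last_error
    wrapped.__name__ = f"{tool.__name__}+retry"
    return wrapped

calls = {"flaky": 0}

def flaky(arg):
    # Hypothetical tool that fails twice, then succeeds.
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("transient failure")
    return f"ok:{arg}"

result = with_retry(flaky)("ping")
```

Because the wrapped function is still an ordinary tool, it can sit in the ordered list passed to `agent_with_recovery`: retry handles transient failures of one tool, recovery handles the case where that tool never succeeds.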

The failure log

python

used = ["tool_A-FAILED", "tool_B"]

Keep the trail of attempts. When the agent succeeds, you know how it got there. When it fails, you know what was tried. Production observability builds on this.
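A string trail is enough for the lesson, but the same idea extends to one structured entry per attempt. This is a sketch under assumptions: the `run_with_log` name and the entry fields (`tool`, `status`, `error`, `seconds`) are illustrative choices, not a fixed schema.

```python
import time

def run_with_log(goal, tools):
    """Like agent_with_recovery, but records one structured entry per attempt."""
    log = []
    for tool in tools:
        started = time.perf_counter()
        entry = {"tool": tool.__name__, "status": "ok", "error": None}
        try:
            result = tool(goal)
        except Exception as e:
            entry["status"] = "failed"
            entry["error"] = f"{type(e).__name__}: {e}"
            result = None
        entry["seconds"] = round(time.perf_counter() - started, 4)
        log.append(entry)
        if entry["status"] == "ok":
            return {"result": result, "log": log}
    # Return instead of raising so the full log survives a total failure.
    return {"result": None, "log": log}

# Hypothetical tools for illustration only.
def tool_A(goal):
    raise ValueError("bad input")

def tool_B(goal):
    return "done"

report = run_with_log("demo", [tool_A, tool_B])
for entry in report["log"]:
    print(entry["tool"], entry["status"], entry["error"])
```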

When NOT to recover

Auth failure on every tool — the user account is the issue, not the tool
Logical errors — the goal itself is unreachable; trying more tools won't help
Side-effect already fired — recovery after a partial commit can compound damage. Some failures are best treated as "halt and call a human"
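One way to encode these cases is to treat certain exception types as a signal to halt instead of trying the next tool. The exception classes `AuthError` and `SideEffectError` and the tools below are hypothetical; how a real tool reports these conditions depends on the tool.

```python
class AuthError(Exception):
    """Hypothetical: credentials are invalid, so every tool would fail the same way."""

class SideEffectError(Exception):
    """Hypothetical: the tool failed after partially committing a change."""

HALT_ON = (AuthError, SideEffectError)

def guarded_recovery(goal, tools):
    used = []
    for tool in tools:
        try:
            result = tool(goal)
            used.append(tool.__name__)
            return {"status": "ok", "used": used, "result": result}
        except HALT_ON as e:
            # Do not try another tool; hand the situation to a person.
            used.append(f"{tool.__name__}-HALTED")
            return {"status": "needs_human", "used": used, "reason": str(e)}
        except Exception:
            used.append(f"{tool.__name__}-FAILED")
    return {"status": "failed", "used": used}

# Hypothetical tools for illustration only.
def charge_card(goal):
    raise SideEffectError("charge sent, receipt not stored")

def backup_charge(goal):
    return "charged"

outcome = guarded_recovery("order-1", [charge_card, backup_charge])
```

Here `backup_charge` is never called: retrying the payment through a second tool after a partial commit could charge twice, so the loop stops and reports why.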
