Five weeks of functions. load_responses_from_csv, safe_compute_avg, rank_groups_by_satisfaction, demographic_summary, categorize_satisfaction. How do you want to connect them?
safe_compute_avg from yesterday makes the averaging crash-safe. I parse the CSV, compute the overall average safely, group by demographic, rank the groups, and flag any group with fewer than 10 responses. That's the whole methodology section in one chain.
That chain is the capstone. Every function you've built this track has a specific role. The pipeline's job is to call them in the right order and wrap the result in a structured output your advisor can read:
def thesis_pipeline(csv_text: str) -> dict:
    responses = load_responses_from_csv(csv_text)
    overall_avg = safe_compute_avg(responses)
    ranked = rank_groups_by_satisfaction(responses, "year")
    low_sample = [g for g in ranked if g["count"] < 10]
    return {
        "ranked_groups": ranked,
        "low_sample_groups": [g["group"] for g in low_sample],
        "overall_avg_satisfaction": overall_avg
    }

Should the pipeline take the grouping field as a parameter, or hardcode "year"?
Excellent instinct. Parameterise it — your advisor might want the cross-tab by major next week. But for the capstone the default is "year" because that's the primary analysis dimension. Default args let you do both:
def thesis_pipeline(csv_text: str, field: str = "year") -> dict:
    """Full thesis analysis pipeline: parse → clean → group → rank → flag low-sample."""
    responses = load_responses_from_csv(csv_text)
    overall_avg = safe_compute_avg(responses)
    ranked = rank_groups_by_satisfaction(responses, field)
    low_sample = [g["group"] for g in ranked if g["count"] < 10]
    print(f"Pipeline: {len(responses)} responses, {len(ranked)} groups, overall avg {overall_avg:.2f}")
    return {"ranked_groups": ranked, "low_sample_groups": low_sample, "overall_avg_satisfaction": overall_avg}

I ran this on my actual Qualtrics export. Cross-tabs matched SPSS output. My advisor wants the script. That's a real thesis deliverable.
Your committee reviewer just saved a weekend.
Twenty-five functions, five weeks. The capstone calls them all. This is what a reproducible methodology section looks like.
The pipeline is as good as its weakest function. safe_compute_avg handles bad data. load_responses_from_csv handles quoted fields. is_valid_response guards against missing keys. Each function does one thing well — the pipeline just connects them. That's the architecture of reliable code.
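As a rough sketch of what two of those guards might look like (the real implementations from earlier lessons may differ in detail; these are illustrative):

```python
def is_valid_response(row: dict) -> bool:
    """Guard: a usable response has a 'satisfaction' key with a numeric value."""
    value = row.get("satisfaction")
    return isinstance(value, (int, float))

def safe_compute_avg(responses: list) -> float:
    """Average satisfaction over valid rows only; returns 0.0 if none are valid."""
    scores = [r["satisfaction"] for r in responses if is_valid_response(r)]
    return sum(scores) / len(scores) if scores else 0.0
```

Because each guard is small and single-purpose, the pipeline never has to re-check the same condition twice.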
A pipeline function chains specialised functions in sequence, each responsible for one transform:
csv_text → parse → safe_avg → group → rank → flag → output dict
field as an arg, not a string literal

Groups with fewer than 10 responses are statistically unreliable. Flag them in the output so reviewers can note the limitation — don't silently exclude them.
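To see the whole chain run end to end, here is a self-contained sketch with simplified stand-ins for the earlier functions (the real ones handle quoted fields and bad data; these stubs cover only the happy path, and the tiny CSV is invented for illustration):

```python
import csv
import io

def load_responses_from_csv(csv_text: str) -> list:
    """Stand-in parser: CSV text -> list of dicts, satisfaction coerced to float."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["satisfaction"] = float(row["satisfaction"])
    return rows

def safe_compute_avg(responses: list) -> float:
    """Stand-in: mean satisfaction, 0.0 for an empty list."""
    scores = [r["satisfaction"] for r in responses]
    return sum(scores) / len(scores) if scores else 0.0

def rank_groups_by_satisfaction(responses: list, field: str) -> list:
    """Stand-in: group by `field`, average each group, sort by average descending."""
    groups = {}
    for r in responses:
        groups.setdefault(r[field], []).append(r["satisfaction"])
    ranked = [
        {"group": name, "count": len(vals), "avg": sum(vals) / len(vals)}
        for name, vals in groups.items()
    ]
    return sorted(ranked, key=lambda g: g["avg"], reverse=True)

def thesis_pipeline(csv_text: str, field: str = "year") -> dict:
    """Chain: parse -> overall average -> rank groups -> flag low-sample groups."""
    responses = load_responses_from_csv(csv_text)
    overall_avg = safe_compute_avg(responses)
    ranked = rank_groups_by_satisfaction(responses, field)
    low_sample = [g["group"] for g in ranked if g["count"] < 10]
    return {
        "ranked_groups": ranked,
        "low_sample_groups": low_sample,
        "overall_avg_satisfaction": overall_avg,
    }

sample = "year,satisfaction\nfirst,4\nfirst,5\nsecond,3\n"
result = thesis_pipeline(sample)
```

With only three rows, both groups fall under the 10-response threshold, so both appear in `low_sample_groups` — exactly the limitation a reviewer would want surfaced rather than hidden.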
You're submitting your thesis methodology section and need a single function that takes the raw Qualtrics CSV export as a string and returns the complete analysis: ranked demographic groups, names of low-sample groups (fewer than 10 responses), and the overall average satisfaction. Write `thesis_pipeline(csv_text, field='year')` that chains the full pipeline.