Four weeks of functions. filter_eligible, safe_compute_outcome, group_by_treatment, treatment_summary, rank_groups_by_outcome. What does the capstone do that none of them do alone?
safe_compute_outcome from Day 27 handles missing outcomes. rank_groups_by_outcome from Day 26 ranks the groups. The capstone chains them all — one function that takes raw respondents and returns the full journal-ready summary. Including outlier group flagging.
Exactly. The pipeline: apply safe_compute_outcome to every respondent to repair missing fields, then filter_eligible to apply the pre-registered inclusion criterion, then treatment_summary for the stats, then rank_groups_by_outcome for the ranked table, then flag any group whose mean outcome exceeds 1.25× the overall mean as an outlier:
overall_mean = sum(r["outcome"] for r in eligible) / len(eligible)
outlier_groups = [g["group"] for g in ranked if g["mean_outcome"] > overall_mean * 1.25]
Why 1.25×? Is that a standard threshold?
It's the threshold you pre-registered. In code it's just a constant — change it to 1.5 for a more liberal threshold or 1.1 for a stricter one. The point is that it's explicit and versioned in the script, not hidden in an SPSS dialog.
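One way to keep that threshold explicit is a named module-level constant. This is a minimal sketch; the name OUTLIER_THRESHOLD and the helper flag_outliers are illustrative, not part of the original script:

```python
# Pre-registered outlier threshold. Naming it makes the choice
# visible in code review and easy to change in one place.
OUTLIER_THRESHOLD = 1.25

def flag_outliers(ranked: list, overall_mean: float) -> list:
    """Return labels of groups whose mean outcome exceeds the threshold."""
    return [g["group"] for g in ranked
            if g["mean_outcome"] > overall_mean * OUTLIER_THRESHOLD]

groups = [{"group": "A", "mean_outcome": 10.0},
          {"group": "B", "mean_outcome": 14.0}]
print(flag_outliers(groups, 10.0))  # ['B'] -- 14.0 > 10.0 * 1.25
```

Switching to a more liberal or stricter criterion is then a one-line diff that shows up in version control.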
So the return value is {"ranked_groups": [...], "outlier_groups": [...], "overall_mean_outcome": float} — everything a reviewer needs to verify the analysis.
The entire methods section output, generated by one function call:
def analysis_pipeline(respondents: list) -> dict:
    # Repair missing outcomes, then apply the pre-registered inclusion criterion.
    repaired = [{**r, "outcome": safe_compute_outcome(r)} for r in respondents]
    eligible = filter_eligible(repaired, 18.0)
    ranked = rank_groups_by_outcome(eligible)
    # Guard against an empty eligible sample before dividing.
    overall_mean = round(sum(r["outcome"] for r in eligible) / len(eligible), 2) if eligible else 0.0
    # Pre-registered threshold: flag groups above 1.25x the overall mean.
    outlier_groups = [g["group"] for g in ranked if g["mean_outcome"] > overall_mean * 1.25]
    result = {"ranked_groups": ranked, "outlier_groups": outlier_groups, "overall_mean_outcome": overall_mean}
    print(f"Pipeline complete: {len(ranked)} groups, {len(outlier_groups)} outlier groups")
    return result
I ran this on the wave-3 data. Three treatment groups, all stats computed, one outlier flagged. My co-author said she wants to rerun it herself — I just send her the script.
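A minimal end-to-end run can be sketched with toy stand-ins for the track's helper functions. These stub implementations are illustrative only, assumed from the behavior described in the lesson, not the originals:

```python
def safe_compute_outcome(r: dict) -> float:
    # Stand-in: fall back to 0.0 when the outcome is missing.
    outcome = r.get("outcome")
    return outcome if outcome is not None else 0.0

def filter_eligible(rows: list, min_age: float) -> list:
    # Stand-in for the pre-registered inclusion criterion.
    return [r for r in rows if r["age"] >= min_age]

def rank_groups_by_outcome(rows: list) -> list:
    # Stand-in: group by treatment, then sort by mean outcome, descending.
    by_group: dict = {}
    for r in rows:
        by_group.setdefault(r["group"], []).append(r["outcome"])
    ranked = [{"group": g, "mean_outcome": sum(v) / len(v)} for g, v in by_group.items()]
    return sorted(ranked, key=lambda g: g["mean_outcome"], reverse=True)

def analysis_pipeline(respondents: list) -> dict:
    repaired = [{**r, "outcome": safe_compute_outcome(r)} for r in respondents]
    eligible = filter_eligible(repaired, 18.0)
    ranked = rank_groups_by_outcome(eligible)
    overall_mean = round(sum(r["outcome"] for r in eligible) / len(eligible), 2) if eligible else 0.0
    outlier_groups = [g["group"] for g in ranked if g["mean_outcome"] > overall_mean * 1.25]
    return {"ranked_groups": ranked, "outlier_groups": outlier_groups,
            "overall_mean_outcome": overall_mean}

respondents = [
    {"group": "control", "age": 25.0, "outcome": 4.0},
    {"group": "treated", "age": 30.0, "outcome": 9.0},
    {"group": "treated", "age": 17.0, "outcome": 99.0},  # excluded: under 18
    {"group": "control", "age": 40.0, "outcome": None},  # repaired to 0.0
]
print(analysis_pipeline(respondents))
```

With these stubs the eligible outcomes are 4.0, 9.0, and 0.0, so the overall mean is 4.33 and only the treated group (mean 9.0) exceeds 4.33 × 1.25 and gets flagged.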
The script is the reproducibility. Not the screenshot, not the SPSS output file, not the email attachment with "final_v3_revised.xlsx" in the name. The Python function, with its inputs documented and its constants named, is the analysis you can cite.
The capstone assembles every function from the track into one composable pipeline:
raw respondents
→ safe_compute_outcome (repair missing outcomes)
→ filter_eligible (apply inclusion criterion)
→ treatment_summary (N, mean outcome, mean age per group)
→ rank_groups_by_outcome (sorted by mean outcome)
→ flag outliers (groups > 1.25× overall mean)
→ return {ranked_groups, outlier_groups, overall_mean_outcome}
A pipeline function is a specification — it has a name, a signature, and a documented output shape. Co-authors can rerun it on new waves. Reviewers can read it. You can version it in Git. That's reproducibility.
The {**r, "outcome": value} pattern: {**r, "key": new_value} creates a new dict copying all of r's fields and overwriting "key" with new_value. Non-mutating — r is unchanged.
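A quick demonstration of the non-mutating update (the variable names here are illustrative):

```python
r = {"id": 7, "outcome": None}

# Build a new dict: copy every field of r, then overwrite "outcome".
repaired = {**r, "outcome": 3.5}

print(repaired)  # {'id': 7, 'outcome': 3.5}
print(r)         # {'id': 7, 'outcome': None} -- original is untouched
```

Because the original record is never mutated, rerunning the pipeline on the same input always starts from the same raw data.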
Hassan needs the full reproducible analysis pipeline. Write `analysis_pipeline(respondents)` that: (1) applies `safe_compute_outcome` to each respondent, (2) filters by min_age 18.0, (3) computes `rank_groups_by_outcome`, (4) computes overall mean outcome, (5) flags outlier groups whose mean exceeds 1.25× the overall mean. Return `{"ranked_groups": [...], "outlier_groups": [...], "overall_mean_outcome": float}`.