zuzu.codes

AI can write code — we teach you to read it, fix it, own it. One lesson, one challenge, every day for 30 days.

© 2026 zuzu.codes

Python for Data Analysts

Stop copy-pasting formulas. Start writing scripts that do the work for you.

student (curious)

I'm a data analyst. I live in Excel. Why should I bother learning Python?

teacher (encouraging)

How much time do you spend cleaning data every week?

student (thinking)

Honestly? 6-8 hours. Removing duplicates, fixing formats, merging sheets from different sources.

teacher (serious)

Python can do all of that in under 30 seconds. A 10-line script cleans, merges, and formats data that takes you hours by hand. Here's what that Monday cleanup actually looks like:

python
import pandas as pd

df = pd.read_csv("sales_data.csv")
df = df.drop_duplicates()
df["date"] = pd.to_datetime(df["date"])
df["revenue"] = df["revenue"].fillna(0)
df.to_csv("sales_data_clean.csv", index=False)
print("Done. Cleaned", len(df), "rows.")

You write it once. Run it every Monday. Never think about it again.

student (curious)

But what about pandas? I keep hearing about it and it looks intimidating.

teacher (amused)

Pandas is just Excel on steroids — the concepts map almost directly. VLOOKUP becomes df.merge(), pivot tables become df.groupby(), filtering becomes df[df["revenue"] > 1000]. You already know the concepts. Pandas just removes the row limit and the crashes.
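
That mapping can be seen end-to-end on a few toy rows (the data and column names here are invented for illustration):

```python
import pandas as pd

# Toy data standing in for two Excel sheets
orders = pd.DataFrame({"id": [1, 2, 3], "revenue": [1200, 800, 3000]})
customers = pd.DataFrame({"id": [1, 2, 3], "region": ["North", "South", "North"]})

# VLOOKUP -> merge on a shared key
merged = orders.merge(customers, on="id")

# Pivot table -> groupby + aggregate
by_region = merged.groupby("region")["revenue"].sum()

# Filter -> boolean indexing
big_orders = merged[merged["revenue"] > 1000]

print(by_region)
print(big_orders)
```

Three formulas you already know, three lines of pandas each.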

student (thinking)

My company uses Excel. Everyone sends me spreadsheets. Can Python work with that?

teacher (focused)

Python reads .xlsx files natively. Load a multi-sheet workbook, process every sheet, write results back — without opening Excel once. And it works with CSVs, Google Sheets, SQL databases, and APIs. Whatever your company throws at you.
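
A sketch of the multi-sheet workflow. The workbook is built in the script itself so the example is self-contained; the filename, sheet names, and columns are all invented (writing .xlsx requires the openpyxl package, which pandas uses by default):

```python
import pandas as pd

# Build a small two-sheet workbook standing in for the file a colleague sends
jan = pd.DataFrame({"region": ["North", "South"], "revenue": [100, 200]})
feb = pd.DataFrame({"region": ["North", "South"], "revenue": [150, 250]})
with pd.ExcelWriter("monthly.xlsx") as writer:
    jan.to_excel(writer, sheet_name="Jan", index=False)
    feb.to_excel(writer, sheet_name="Feb", index=False)

# sheet_name=None loads every sheet into a dict of DataFrames
sheets = pd.read_excel("monthly.xlsx", sheet_name=None)
combined = pd.concat(sheets.values(), ignore_index=True)
print(f"{len(sheets)} sheets, {len(combined)} rows total")
```

Excel never opens; every sheet is processed in one pass.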

student (excited)

That Monday cleanup takes me 2 hours every week. If I could automate that...

teacher (proud)

That's 100 hours a year you'd get back. And cohort analysis that takes a day in Excel takes 20 lines in pandas. Start with the Python Fundamentals track — by week 3 you'll be automating your first real workflow.

student (excited)

OK I'm starting today. My spreadsheets can wait.

The Full Picture

Why Data Analysts Are Moving to Python (And What Changes When You Do)

If you've spent years in Excel and Google Sheets, you're already a programmer — you just don't know it yet. Every SUMIFS, every VLOOKUP, every pivot table is a data transformation expressed in a domain-specific language. Python is that same logic, freed from a grid.

This isn't about replacing Excel. It's about knowing when Excel is the right tool and when Python does in 10 seconds what Excel can't finish at all.

The Row Limit Problem Is Real

Excel's 1,048,576-row limit sounds huge until you're working with transaction logs, clickstream data, or any dataset that accumulates daily. At 800K rows, Excel slows to a crawl; at the limit, it stops loading entirely. Pandas loads 10 million rows in under 3 seconds on a laptop, and the same code works unchanged whether you have 10 rows or 10 million.

python
import pandas as pd

# 10 million rows — runs in under 3 seconds
df = pd.read_csv("transactions_2024.csv")
print(f"Loaded {len(df):,} rows")  # Loaded 10,000,000 rows

# Group and aggregate — no spinning wheel
summary = df.groupby(["region", "product_category"]).agg(
    total_revenue=("revenue", "sum"),
    order_count=("order_id", "count"),
    avg_order_value=("revenue", "mean")
).round(2)

The Mapping From Excel to Pandas

The conceptual jump is smaller than it looks:

Excel operation → pandas equivalent:

  • VLOOKUP → df.merge(other, on="id")
  • Pivot table → df.groupby("category").sum()
  • Remove blank rows → df.dropna()
  • Filter rows → df[df["revenue"] > 1000]
  • COUNTIF → df["status"].value_counts()
  • Sort A to Z → df.sort_values("date")
  • IF formula → df["flag"] = df["revenue"].apply(lambda x: "high" if x > 5000 else "low")
  • IFERROR → pd.to_numeric(df["col"], errors="coerce")

Learning pandas is mostly re-learning what you already know under different syntax.
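
The remaining rows of the mapping can be exercised on a few hand-made rows (data invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "status": ["paid", "paid", "refunded", None],
    "revenue": [6000, 900, 1200, None],
})

# COUNTIF -> value_counts (ignores blanks)
counts = df["status"].value_counts()

# Remove blank rows -> dropna
clean = df.dropna()

# IF formula -> apply with a condition
clean = clean.assign(flag=clean["revenue"].apply(lambda x: "high" if x > 5000 else "low"))
print(counts)
print(clean)
```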

Automation: The Real Unlock

The biggest shift isn't the analysis — it's repeatability. In Excel, every Monday you open the file, run the same steps, and save the result. In Python, you write those steps once and schedule them:

python
# schedule with cron: 0 8 * * 1  (every Monday at 8 AM)
import pandas as pd
import smtplib
from email.mime.text import MIMEText

df = pd.read_csv("transactions.csv")
weekly = df.groupby("region")["revenue"].sum().reset_index()
weekly.columns = ["Region", "Total Revenue"]
weekly["Total Revenue"] = weekly["Total Revenue"].map("${:,.0f}".format)

table_html = weekly.to_html(index=False)
msg = MIMEText(f"<h2>Weekly Revenue</h2>{table_html}", "html")
msg["Subject"] = "Weekly Revenue Report — Auto-generated"
msg["From"] = "analytics@company.com"
msg["To"] = "team@company.com"
# Send via your SMTP server (hostname is a placeholder):
# with smtplib.SMTP("smtp.example.com") as server:
#     server.send_message(msg)
print("Report sent.")

That script replaces 90 minutes of your Monday. Every Monday. Forever.

Before and After: A Typical Analyst Week

Task: before Python → after Python:

  • Weekly data cleanup: 6-8 hours manual → 30-second script, scheduled
  • Merging 12 monthly sheets: 45 minutes of copy-paste → pd.concat([xl.parse(s) for s in xl.sheet_names])
  • Cohort retention analysis: full day in Excel → 20-line pandas script
  • Ad-hoc revenue breakdown: new pivot table per request → reusable script with parameters
  • Stakeholder report: built from scratch weekly → scheduled email, auto-formatted
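
A "reusable script with parameters" can be as small as a function that takes the grouping column and period as arguments. A minimal sketch, with function name, columns, and data all invented for illustration:

```python
import pandas as pd

def revenue_breakdown(df, by="region", quarter=None):
    """Ad-hoc breakdown: group by any column, optionally filter to one quarter."""
    if quarter is not None:
        df = df[df["date"].dt.quarter == quarter]
    return df.groupby(by)["revenue"].sum().sort_values(ascending=False)

# Toy data
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-07-01", "2024-08-15", "2024-02-10"]),
    "region": ["North", "South", "North"],
    "revenue": [500, 300, 200],
})

q3 = revenue_breakdown(df, by="region", quarter=3)
print(q3)
```

The next "new pivot table per request" becomes a one-line function call with different arguments.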

Cohort Analysis: Where Python Leaves Excel Behind

Retention analysis is the clearest example of Python doing something Excel genuinely cannot:

python
import pandas as pd

df = pd.read_csv("user_events.csv", parse_dates=["event_date"])

# Each user's acquisition month
first_activity = df.groupby("user_id")["event_date"].min().dt.to_period("M")
first_activity.name = "cohort"

df = df.join(first_activity, on="user_id")
df["months_since_start"] = (
    df["event_date"].dt.to_period("M") - df["cohort"]
).apply(lambda x: x.n)

cohort_table = df.groupby(["cohort", "months_since_start"])["user_id"].nunique().unstack()
retention = cohort_table.divide(cohort_table[0], axis=0)

This produces a full cohort retention matrix across every acquisition month. Building the equivalent in Excel requires hours of formulas and manual cross-referencing. In Python it's a one-time write.
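
To sanity-check the logic, here are the same steps run on four invented events: users A and B acquired in January, C in February, and only A comes back a month later:

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": ["A", "A", "B", "C"],
    "event_date": pd.to_datetime(["2024-01-05", "2024-02-10", "2024-01-15", "2024-02-20"]),
})

first = df.groupby("user_id")["event_date"].min().dt.to_period("M")
first.name = "cohort"
df = df.join(first, on="user_id")
df["months_since_start"] = (
    df["event_date"].dt.to_period("M") - df["cohort"]
).apply(lambda x: x.n)

table = df.groupby(["cohort", "months_since_start"])["user_id"].nunique().unstack()
retention = table.divide(table[0], axis=0)
print(retention)  # Jan cohort: 100% in month 0, 50% in month 1
```

One of two January users returned, so the January cohort shows 0.5 retention at month 1, exactly what a hand count gives.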

Your 30-Day Progression

  • Week 1-2: Read CSVs and Excel files, filter and sort data, understand pandas basics
  • Week 3: Automate your first real cleanup task — replace one manual Monday workflow
  • Week 4: Build a reusable script your team starts using
  • Month 2: Connect to databases directly — no more CSV exports from IT
  • Month 3: Build a scheduled report that emails itself
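
The "connect to databases" step can look like this with the standard-library sqlite3 driver. The table, columns, and data are invented; an in-memory database stands in for the company warehouse:

```python
import sqlite3

import pandas as pd

# In-memory database standing in for the real warehouse
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 500.0), ("South", 300.0), ("North", 200.0)],
)

# No CSV export needed: query straight into a DataFrame
df = pd.read_sql("SELECT region, SUM(revenue) AS total FROM sales GROUP BY region", conn)
print(df)
```

Swap the connection for your real database driver and the rest of the analysis code stays the same.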

The Business Case (for Your Manager)

If you earn $70K a year and spend 30% of your time on manual data wrangling, that's $21,000 of salary spent on work a script could do. Show your manager one automated report. That's the pitch. Analysts who learn Python don't just save time — they take on higher-value work, get noticed, and move up faster.

The analysts who get replaced by automation are the ones who only know Excel. The ones who stay are the ones who write the automation.

Think About It

Not syntax — just thinking. How would you solve these?

1. Your manager asks for a breakdown of total revenue by region for Q3. You have a CSV with 200,000 rows. What's the right approach?

2. A colleague sends you a dataset where dates are stored as strings like '03-15-2024'. Your analysis needs to calculate days between events. What do you do first?

3. You've built a weekly cleanup script that takes 5 seconds to run. Your manager wants the cleaned file emailed to the team every Monday at 8 AM. What's the best next step?

Try It Yourself

Build real Python step by step — runs right here in your browser.

Clean the Weekly Sales Data

You receive a raw sales CSV every Monday. It has duplicate rows, missing revenue values, and inconsistent region names ("North", "north", "NORTH" all mean the same thing). Write a function `clean_sales(rows)` that takes a list of row dicts and returns a cleaned list:

  • Remove exact duplicate rows
  • Fill missing revenue (None or missing key) with 0
  • Normalize region names to title case (e.g. "north" → "North")

Tests
# clean_sales([{"id":1,"region":"north","revenue":500},{"id":1,"region":"north","revenue":500},{"id":2,"region":"SOUTH","revenue":None}])
[
  {
    "id": 1,
    "region": "North",
    "revenue": 500
  },
  {
    "id": 2,
    "region": "South",
    "revenue": 0
  }
]
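
One possible clean_sales that satisfies the test above (a sketch, not the only answer; deduplicating before normalizing is a design choice, since the spec asks for *exact* duplicates):

```python
def clean_sales(rows):
    seen = set()
    cleaned = []
    for row in rows:
        # Exact-duplicate check on the raw row, before any normalization
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        out = dict(row)
        # A missing key and an explicit None both count as missing revenue
        if out.get("revenue") is None:
            out["revenue"] = 0
        out["region"] = out["region"].title()
        cleaned.append(out)
    return cleaned
```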

Try zuzu.codes free

Start with the free Python track. No credit card required.

More Professions

🔄

Learn to Code for Career Switchers

💰

Python for Finance Professionals

📣

Coding for Marketers

🎯

Coding for Product Managers

Common Questions