Yesterday you wrote a response-time statistics function. How are you going to test it? You need realistic input data — not just [100, 200, 300], but something that looks like actual server response times.
I could write test cases by hand... but that's fifty numbers typed manually and it won't catch distribution edge cases. I need synthetic data that looks like real server traffic — mostly fast responses with occasional spikes.
random module. Python's built-in controlled randomness. "Controlled" is the key word — seeded random is reproducible:
import random
# Seed for reproducibility — same seed, same sequence every time
rng = random.Random(42)
# Uniform distribution: any value between low and high with equal probability
response_time = rng.uniform(50, 400)
print(round(response_time, 1)) # 273.8 — same every time with seed=42
# Integer
status_code = rng.randint(200, 503) # includes both endpoints
The seed makes the randomness reproducible. Every test run with random.Random(42) generates the exact same sequence. So my test assertions can check exact values, not just ranges.
Exactly. In test code, always seed. In simulation code where you want genuinely different output each run, don't seed, or call random.seed(None) to reseed from system entropy. The instance approach — rng = random.Random(42) — is better than random.seed(42) at the module level because it doesn't affect other code that uses random.
How do I generate data that looks like real server traffic? Most requests are fast — under 300ms — but there are occasional spikes to 2000ms or more. Uniform distribution would give me too many slow requests.
random.gauss() gives you normal (bell curve) distribution — most values cluster around the mean, fewer values appear at the extremes:
import random
rng = random.Random(42)
def generate_response_times(n: int, mean_ms: float = 200, spike_rate: float = 0.05) -> list[float]:
    times = []
    for _ in range(n):
        if rng.random() < spike_rate:
            # Spike: slow request
            times.append(abs(rng.gauss(1500, 300)))
        else:
            # Normal request
            times.append(abs(rng.gauss(mean_ms, 50)))
    return times
sample = generate_response_times(1000)
print(f"Mean: {sum(sample)/len(sample):.1f}ms")
print(f"Max: {max(sample):.1f}ms")
5% of requests are slow spikes around 1500ms, 95% are normal requests around 200ms. That's what real server traffic looks like — mostly nominal with occasional slow outliers. I can use this to test whether my percentile and standard deviation functions actually capture the spikes.
And random.choice() for selecting from a list, random.choices() for weighted selection — useful for generating error codes with realistic frequency:
import random
rng = random.Random(42)
log_levels = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
weights = [5, 60, 25, 8, 2] # relative frequencies
sample_levels = rng.choices(log_levels, weights=weights, k=100)
from collections import Counter
print(Counter(sample_levels).most_common())
# [('INFO', ~60), ('WARNING', ~25), ('ERROR', ~8), ('DEBUG', ~5), ('CRITICAL', ~2)]
random.choices() with weights — INFO appears 60 times out of 100 on average. That models the actual distribution of log levels in the ops team's data. Most entries are INFO, some are WARNING, a few are ERROR, almost none are CRITICAL.
And for shuffling a list — randomizing the order of log entries for testing that your analysis doesn't depend on ordering:
import random
rng = random.Random(42)
entries = list(range(10))
rng.shuffle(entries) # in-place
print(entries) # same permutation every time with seed=42
random.sample(population, k) for picking k items without replacement — that's for sampling a large log file to inspect a representative subset. random.choices() is with replacement. random.sample() is without. Now I can generate realistic test data for every function I've written this week.
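That sampling workflow can be sketched like this, using a made-up list of log entry IDs as the population:

```python
import random

rng = random.Random(42)

# Hypothetical population: 10,000 log entry IDs
log_ids = list(range(10_000))

# Pick 5 distinct entries to inspect; duplicates are impossible
subset = rng.sample(log_ids, k=5)
assert len(subset) == len(set(subset))  # sample() never repeats items

# choices() draws with replacement, so repeats can occur,
# which matters when k is large relative to the population
with_replacement = rng.choices(log_ids, k=5)
```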
That's the week summary: statistics functions need realistic test inputs; random generates them. Tomorrow: collections — the highlight of Week 3. Counter.most_common(5) replaces twenty lines of manual dict accumulation. You're going to have a strong reaction.
The random module generates pseudo-random numbers using the Mersenne Twister algorithm — fast, well-distributed, reproducible with a seed. "Pseudo-random" is the key qualifier: the sequence looks random but is fully determined by the initial seed. This is a feature for testing: seeded randomness produces the same data every run, so test assertions can check exact values.
The random module exposes two interfaces. The module-level functions (random.uniform(), random.gauss(), etc.) share a global state. random.seed(42) sets the state, affecting all subsequent calls anywhere in the program — including in libraries you import. The instance API — rng = random.Random(42) — creates an independent generator with its own state. For test code, always use the instance API to avoid contaminating the module-level state and to make the randomness explicit.
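The isolation point can be sketched with a hypothetical library_helper() standing in for third-party code that also draws from the module-level generator:

```python
import random

def library_helper() -> float:
    # Stand-in for third-party code that uses the module-level generator
    return random.random()

# Module-level seeding: shared state, perturbed by any other caller
random.seed(42)
library_helper()              # consumes a value from the shared stream
module_val = random.random()  # NOT the first value of the seed-42 sequence

# Instance API: isolated state, unaffected by library_helper()
rng = random.Random(42)
library_helper()
instance_val = rng.random()   # always the first value of the seed-42 sequence
assert instance_val != module_val
```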
random.uniform(a, b) — uniform distribution between a and b. random.gauss(mu, sigma) — normal (Gaussian) distribution with mean mu and standard deviation sigma. random.expovariate(lambd) — exponential distribution with rate parameter lambd, useful for modeling inter-arrival times in Poisson processes (like web server requests). random.lognormvariate(mu, sigma) — log-normal distribution, appropriate for response time modeling where most values are fast but the tail is long.
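A sketch of those two long-tail distributions; the rate and the mu/sigma values here are illustrative choices, not from the lesson:

```python
import random

rng = random.Random(42)

# Exponential: inter-arrival times for a Poisson process.
# lambd is the rate: 10 requests/second gives a mean gap of 1/10 = 0.1s
gaps = [rng.expovariate(10) for _ in range(1000)]
print(f"mean gap: {sum(gaps) / len(gaps):.3f}s")  # should land near 0.1

# Log-normal: mu and sigma describe the underlying normal distribution,
# so the result is skewed with a long right tail, like response times
times = sorted(rng.lognormvariate(5.3, 0.4) for _ in range(1000))
print(f"median: {times[500]:.0f}ms, rough p99: {times[989]:.0f}ms")
```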
random.choice(seq) — uniform selection of one element. random.choices(population, weights, k) — selection with replacement and optional weights. random.sample(population, k) — selection without replacement (no duplicates). random.shuffle(lst) — in-place shuffling of a list. The distinction between choices (with replacement) and sample (without) matters when the population is small relative to k.
The random module is not cryptographically secure. For security-sensitive operations — generating tokens, session IDs, API keys — use secrets.token_hex(), secrets.token_urlsafe(), or secrets.choice(). The secrets module uses the OS's cryptographic random source. For statistical simulation and test data generation, random is appropriate and significantly faster.
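For contrast, a short sketch of the secrets equivalents; the token and password sizes here are arbitrary:

```python
import secrets
import string

# Cryptographically secure tokens, backed by the OS random source
print(secrets.token_hex(16))      # 32 hex characters
print(secrets.token_urlsafe(16))  # URL-safe base64 text

# secrets.choice() is the secure counterpart of random.choice()
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(12))
print(password)
```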
random.Random(seed) with an integer seed produces the same sequence every run. random.Random() with no seed draws its seed from operating-system entropy (falling back to the clock), so each run differs. For unit tests: seed every generator. For simulation runs: no seed, or document the seed in the output for reproducibility. For production: never use random for security; always use secrets.