parse_respondent_csv from Day 19 splits on commas. What happens if a respondent's treatment group label contains a comma — like "control, phase 2"?
The manual split would break it into two separate fields. The zip would misalign everything after that row.
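The failure is easy to reproduce. A minimal sketch of the naive split-on-comma approach, using a made-up one-row sample (the csv_text here is invented for illustration, not Wave data):

```python
# Hypothetical sample: one header row, one data row with a quoted comma.
csv_text = 'id,group,age\nR_001,"control, phase 2",29'

header, line = csv_text.split("\n")
fields = line.split(",")  # the comma inside the quotes splits too
print(fields)             # ['R_001', '"control', ' phase 2"', '29']
print(len(header.split(",")), "header columns vs", len(fields), "fields")
```

Three header columns, four fields: zipping them together shifts every value after the quoted label one slot to the right.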
That's exactly why the standard library has csv.DictReader. It handles quoted fields, commas inside quotes, and Windows-style line endings automatically. Feed it a StringIO wrapper around the CSV text and it behaves like a real file handle:
```python
import csv, io

reader = csv.DictReader(io.StringIO(csv_text))
for row in reader:
    print(dict(row))  # each row is already a dict with header keys
```

What's io.StringIO? I thought I needed an actual file to use csv.DictReader.
io.StringIO wraps a string and makes it behave like a file object — it has .read() and .readline() methods. csv.DictReader only needs something file-like. That's why we pass text as an argument instead of opening a real file: the function is pure, testable, and works in Pyodide without a filesystem.
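A quick sketch of that file-like behavior, using a tiny invented string:

```python
import io

# Wrap a plain string; the result supports the file protocol.
buf = io.StringIO("id,age\nR_001,29\n")
print(buf.readline())  # returns "id,age\n", just like a file's first line
print(buf.read())      # returns the rest: "R_001,29\n"
```

Anything that only calls .read() or .readline() (or iterates line by line, as csv.DictReader does) can't tell the difference from an open file.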
So the upgrade from parse_respondent_csv to load_respondents_from_csv is literally swapping split(',') for csv.DictReader and getting robust parsing for free?
An extra import and a StringIO wrap. Ten minutes of edge-case handling, purchased:
```python
import csv, io

def load_respondents_from_csv(csv_text: str) -> list:
    reader = csv.DictReader(io.StringIO(csv_text))
    results = []
    for row in reader:
        row["age"] = float(row["age"])
        row["outcome"] = float(row["outcome"])
        results.append(dict(row))
    print(f"Loaded {len(results)} respondents via DictReader")
    return results
```

Wave 4 can have commas in the labels. The pipeline won't break. That's publishable robustness.
One remaining gap: DictReader returns all values as strings — even numeric columns. You still need to convert age and outcome with float(). If a cell is blank or non-numeric, float("") raises ValueError. Week 4's try/except lesson handles that edge case.
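One possible shape of that fix, previewed here as a hedged sketch (the safe_float helper is invented for illustration; Week 4 may structure it differently):

```python
def safe_float(value):
    """Convert a CSV cell to float, returning None for blank/bad cells."""
    try:
        return float(value)
    except ValueError:
        return None  # blank ("") or non-numeric cell

print(safe_float("29.0"))  # 29.0
print(safe_float(""))      # None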
csv.DictReader reads CSV rows as dicts with header keys. io.StringIO wraps a string as a file-like object:
```python
import csv, io

for row in csv.DictReader(io.StringIO(csv_text)):
    print(row)  # row is {"id": "R_001", "age": "29.0", ...}
```

| Edge case | split(',') | csv.DictReader |
|---|---|---|
| Quoted field: "control, phase 2" | Breaks | Handles |
| Empty field | Returns "" | Returns "" |
| Windows line endings (\r\n) | May include \r | Strips |
All values come back as strings. Convert numeric fields: float(row["age"]), int(row["n"]).
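A quick self-contained check of the table's first row (the sample data below is invented for illustration):

```python
import csv, io

# One row whose group label contains a quoted comma.
csv_text = 'id,group,age,outcome\nR_001,"control, phase 2",29,0.75\n'
rows = [dict(r) for r in csv.DictReader(io.StringIO(csv_text))]

print(rows[0]["group"])  # "control, phase 2" survives as one field
print(rows[0]["age"])    # "29" -- still a string until you convert it
```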
Ravi needs a robust CSV parser for wave data that handles quoted fields and edge cases. Write `load_respondents_from_csv(csv_text)` using `csv.DictReader` and `io.StringIO` to parse the text into respondent dicts, converting `age` and `outcome` to float.