Your wave data arrives as a CSV export. The first row is a header: id,age,treatment_group,outcome. Subsequent rows are respondents. In Python, how do you turn that text into the respondent dicts you've been working with?
summarize_group from Day 18 takes respondent dicts. I'd need to split the CSV text on newlines to get rows, then split each row on commas to get fields. That sounds like Day 4's string methods.
Exactly right. Splitting on newlines, with split('\n') or the splitlines() method used below, gives you rows; split(',') gives you fields. The header row tells you the field names; subsequent rows give you the values. Combine them to build a dict per row:
lines = csv_text.strip().splitlines()
header = lines[0].split(',')
for line in lines[1:]:
    values = line.split(',')
    row = dict(zip(header, values))
What does zip(header, values) do? I've seen zip but never used it.
zip pairs up two sequences element by element. zip(["id","age"], ["R_001","29"]) yields the pairs ("id","R_001") and ("age","29"); in Python 3 it is a lazy iterator, so wrap it in list() if you want to see them. Wrapping it in dict() gives {"id": "R_001", "age": "29"}. One call pairs the header with the row values:
def parse_respondent_csv(csv_text: str) -> list:
    lines = csv_text.strip().splitlines()
    header = [h.strip() for h in lines[0].split(',')]
    results = []
    for line in lines[1:]:
        values = [v.strip() for v in line.split(',')]
        row = dict(zip(header, values))
        row["age"] = float(row["age"])
        row["outcome"] = float(row["outcome"])
        results.append(row)
    print(f"Parsed {len(results)} respondents from CSV")
    return results
So I can pass the raw CSV text from the SurveyMonkey export and get back the exact same respondent dicts I've been using all week. The whole pipeline connects.
The pipeline that started with one formatted label now reads an entire wave export. That's two weeks of work snapping together.
And because the function takes text, it's testable. I don't need a file — I just write the CSV inline in the test case.
The Pyodide runtime has no pre-loaded files, so text-as-argument is not just testable, it's the only way to run. This constraint is also a reproducibility feature: apart from the progress print, the function has no side effects, and its return value is fully determined by its input.
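A minimal sketch of the inline test described above. parse_respondent_csv is repeated from the lesson so the block runs on its own; the two sample respondents, the group labels "control" and "treatment", and the test function name are made up for illustration.

```python
def parse_respondent_csv(csv_text: str) -> list:
    # Same function as in the lesson, repeated so this block is self-contained
    lines = csv_text.strip().splitlines()
    header = [h.strip() for h in lines[0].split(',')]
    results = []
    for line in lines[1:]:
        values = [v.strip() for v in line.split(',')]
        row = dict(zip(header, values))
        row["age"] = float(row["age"])
        row["outcome"] = float(row["outcome"])
        results.append(row)
    print(f"Parsed {len(results)} respondents from CSV")
    return results


def test_parse_respondent_csv():
    # Hypothetical two-respondent export written inline; no file is needed
    csv_text = """id,age,treatment_group,outcome
R_001,29,control,3.5
R_002,41,treatment,4.0
"""
    rows = parse_respondent_csv(csv_text)

    assert len(rows) == 2
    assert rows[0] == {
        "id": "R_001",
        "age": 29.0,
        "treatment_group": "control",
        "outcome": 3.5,
    }
    # Numeric fields come back as floats, not strings
    assert isinstance(rows[1]["age"], float)
    assert rows[1]["outcome"] == 4.0


test_parse_respondent_csv()
```

Because the input is a plain string, the same test runs unchanged in Pyodide or in a local interpreter.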
CSV is text. Manual parsing: split('\n') → rows, split(',') → fields.
lines = csv_text.strip().split('\n')
header = lines[0].split(',')            # ["id", "age", ...]
for line in lines[1:]:
    values = line.split(',')            # ["R_001", "29", ...]
    row = dict(zip(header, values))     # {"id": "R_001", "age": "29"}
zip pairing
zip(a, b) yields the pairs (a[0], b[0]), (a[1], b[1]), ... dict(zip(header, values)) maps header names to row values in one expression.
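To make the pairing concrete, here is a small sketch using the lesson's header. The row values beyond "R_001" and "29" are invented for illustration. It also shows one behaviour worth knowing: zip stops at the shorter input, so a row with missing fields produces a dict with missing keys rather than an error.

```python
header = ["id", "age", "treatment_group", "outcome"]
values = ["R_001", "29", "control", "3.5"]  # hypothetical sample row

# zip is lazy; list() materialises the pairs so they can be inspected
pairs = list(zip(header, values))
row = dict(pairs)

# A hypothetical malformed row with only two fields:
# zip stops after the second pair, so the last two keys are simply absent
short_values = ["R_002", "41"]
short_row = dict(zip(header, short_values))

print(pairs[0])                 # first pair: header name with its value
print(sorted(short_row))        # only the keys that had a matching value
```

If silent truncation is a concern, comparing len(values) with len(header) before building the dict is one simple guard; on Python 3.10 or later, zip(header, values, strict=True) raises ValueError on a length mismatch instead.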
CSV values are strings. Convert numeric fields with float() or int() before computing statistics.
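A short sketch of why the conversion matters, using a hypothetical parsed row (the "control" label and the 3.5 outcome are invented). Arithmetic on the raw strings either does the wrong thing quietly or fails, and int() rejects text that contains a decimal point.

```python
row = {"id": "R_001", "age": "29", "treatment_group": "control", "outcome": "3.5"}

# Multiplying a string repeats it instead of doing arithmetic
doubled_text = row["age"] * 2              # "2929"
doubled_number = float(row["age"]) * 2     # 58.0

# int() works for whole-number text only
age_int = int(row["age"])                  # 29

# int("3.5") raises ValueError, so decimal fields need float()
try:
    int(row["outcome"])
    int_accepts_decimal_text = True
except ValueError:
    int_accepts_decimal_text = False

outcome = float(row["outcome"])            # 3.5
```

Using float() for both age and outcome, as parse_respondent_csv does, avoids having to know in advance which columns might contain decimals.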
Fatima's wave-3 data arrives as a CSV-formatted string with header `id,age,treatment_group,outcome`. Write `parse_respondent_csv(csv_text)` that splits on newlines and commas, pairs header with values using `zip`, converts `age` and `outcome` to float, and returns a list of respondent dicts.