You have a filtered list of eligible respondents from filter_eligible. Each dict has an outcome field. To compute mean outcome you need those values extracted. In SPSS, how do you get a column of just the outcome values?
filter_eligible from Day 10 gives me the eligible respondents. In SPSS I'd select the column and copy it. In Python I'm guessing there's a loop?
A for loop. Each iteration hands you one respondent dict. You extract the outcome field and build a result list. enumerate() gives you the position too — useful for labelling:
    respondents = [{"id": "R_001", "outcome": 4.5}, {"id": "R_002", "outcome": 3.8}]
    for i, r in enumerate(respondents):
        entry = {"id": r["id"], "outcome": r["outcome"]}
        print(f"Row {i}: {entry}")

Why enumerate? Can't I just use for r in respondents?
You can — enumerate is optional. But enumerate(respondents) gives (0, r), (1, r), ... so each iteration also has a position index. It's useful when you need row numbers in the output for audit purposes — exactly the kind of thing a reproducibility-minded reviewer might ask for.
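To see those pairs directly, here is a small sketch. It wraps enumerate in list() so the (index, item) tuples become visible, then uses enumerate's start parameter to produce 1-based row numbers. The sample data mirrors the two respondents above; the "Row N: id" label format is only an illustrative choice, not a course convention.

```python
respondents = [{"id": "R_001", "outcome": 4.5}, {"id": "R_002", "outcome": 3.8}]

# enumerate yields (index, item) tuples; list() makes them visible
pairs = list(enumerate(respondents))
first_index, first_row = pairs[0]
print(first_index, first_row["id"])  # 0 R_001

# start=1 shifts the counter, which reads more naturally in an audit log
labels = []
for n, r in enumerate(respondents, start=1):
    labels.append(f"Row {n}: {r['id']}")
print(labels)  # ['Row 1: R_001', 'Row 2: R_002']
```

The start parameter changes only the counter; the items still come out in their original order.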
So the result is a list of {"id": ..., "outcome": ...} dicts I can pass directly into a mean calculation?
That's the Week 4 pipeline preview. For now, extract and collect:
    def compute_outcome_per_respondent(respondents: list) -> list:
        results = []
        for i, r in enumerate(respondents):
            entry = {"id": r["id"], "outcome": r["outcome"]}
            results.append(entry)
        print(f"Processed {len(results)} respondents")
        return results

Three weeks of SPSS column operations, replaced by eight lines. And the IDs are preserved for the audit trail.
The IDs matter. A mean computed without a traceable row index is not reproducible — if someone asks which respondents contributed, you need the list. Keep IDs in every intermediate result.
A for loop iterates over every item in a sequence. enumerate pairs each item with its zero-based index:
    for i, r in enumerate(respondents):
        # i is 0, 1, 2 ...
        # r is each respondent dict
        print(i, r["id"])

To collect results, start with an empty list and append inside the loop (collection and transform are placeholders for your own list and processing step):
    results = []
    for r in collection:
        processed = transform(r)
        results.append(processed)

range() alternative
for i in range(len(respondents)): gives you only the index; use respondents[i] to get the item. enumerate is cleaner when you need both.
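A side-by-side sketch of the two styles, using the same two sample respondents as earlier. Both loops build identical (index, id) tuples; the names via_range and via_enumerate are made up for this comparison.

```python
respondents = [{"id": "R_001", "outcome": 4.5}, {"id": "R_002", "outcome": 3.8}]

# range(len(...)) yields only indices, so the item needs a separate lookup
via_range = []
for i in range(len(respondents)):
    r = respondents[i]
    via_range.append((i, r["id"]))

# enumerate yields the index and the item together
via_enumerate = []
for i, r in enumerate(respondents):
    via_enumerate.append((i, r["id"]))

print(via_range == via_enumerate)  # True
```

The results match; enumerate simply removes the extra respondents[i] lookup.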
A list of bare outcome values is opaque — you can't trace back to which respondent contributed which value. Keep id in every intermediate result for reproducibility.
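As a rough sketch of why the IDs pay off later, the snippet below computes a mean from id-carrying results and keeps the list of contributing respondents next to it. It assumes the standard library's statistics.mean and the two sample respondents used above; contributing_ids is an illustrative name, and this is not the Week 4 pipeline itself.

```python
from statistics import mean

results = [{"id": "R_001", "outcome": 4.5}, {"id": "R_002", "outcome": 3.8}]

outcomes = []          # bare values, needed for the arithmetic
contributing_ids = []  # kept alongside so the mean stays traceable
for r in results:
    outcomes.append(r["outcome"])
    contributing_ids.append(r["id"])

mean_outcome = mean(outcomes)
print(f"Mean outcome {mean_outcome:.2f} from {contributing_ids}")
```

If a reviewer asks which respondents went into the figure, contributing_ids answers that without rerunning the filter.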
Nadia has a list of eligible respondent dicts, each containing `"id"` and `"outcome"` fields. Write `compute_outcome_per_respondent(respondents)` that iterates the list using a for loop with `enumerate` and returns a new list of `{"id": ..., "outcome": ...}` dicts — one per respondent.