Last week you formatted one respondent — one ID, one age, one outcome. Now imagine your wave-3 dataset has 300 respondents across five treatment groups. In SPSS, how do you get the count per group?
I'd run Frequencies, filter by group, screenshot each table. Which I've done forty times.
In Python, you don't filter and screenshot; you loop. A list holds all 300 respondents the way an SPSS dataset holds all rows, except you can filter it, loop over every item, and stop the moment you find what you need.
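To make those three operations concrete, here is a minimal sketch on three respondents instead of 300. The key names (`id`, `age`, `outcome`, `group`), the age minimum of 18, and the cutoff of 9.0 are all assumptions made for illustration, not values from the course dataset.

```python
# Hypothetical sample: three respondents standing in for 300.
# The keys "id", "age", "outcome", and "group" are assumed names.
respondents = [
    {"id": "R001", "age": 34, "outcome": 4.2, "group": "control"},
    {"id": "R002", "age": 17, "outcome": 3.1, "group": "A"},
    {"id": "R003", "age": 52, "outcome": 9.7, "group": "A"},
]

# Filter: keep only respondents aged 18 or older (assumed minimum).
adults = [r for r in respondents if r["age"] >= 18]

# Loop over every item: collect the outcome of each remaining respondent.
outcomes = []
for r in adults:
    outcomes.append(r["outcome"])

# Stop early: find the first outcome above an assumed cutoff of 9.0.
first_high = None
for r in respondents:
    if r["outcome"] > 9.0:
        first_high = r["id"]
        break  # no need to look at the remaining respondents
```

Each respondent is one dict, and the list plays the role of the rows in the data view.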
How do you get from a list of 300 respondent dicts to grouped counts with a mean per group?
That's the arc of this week. You start by filtering to eligible respondents with a list check. Then a for loop extracts the outcome from each. A while loop with break finds the first outlier without scanning everything. A dict groups by treatment group the way a pivot does. And the final lesson nests those structures so each group carries its own N and mean. By Friday, grouping 300 respondents by treatment condition is a single function call.
filter_eligible: filter a list of respondent dicts by age minimum
compute_outcome_per_respondent: for loop over the list, extract outcome for each
find_first_outlier: while loop + break (stop at the first respondent above a cutoff)
group_by_treatment: dict (accumulate respondents into treatment-group buckets)
treatment_summary: nested structures (N, mean outcome, and mean age per group)

Goal: replace a manual pivot table with a function that groups 300 respondents by treatment condition.
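As a preview, the five functions named in this week's plan could look roughly like the sketch below. Only the function names come from the plan; the parameters (`min_age`, `cutoff`), the dict keys (`id`, `age`, `outcome`, `group`), and the sample data are assumptions, and the versions you build in the lessons may differ.

```python
def filter_eligible(respondents, min_age):
    """Keep respondents whose age is at least min_age."""
    return [r for r in respondents if r["age"] >= min_age]


def compute_outcome_per_respondent(respondents):
    """Extract the outcome of each respondent with a for loop."""
    outcomes = []
    for r in respondents:
        outcomes.append(r["outcome"])
    return outcomes


def find_first_outlier(respondents, cutoff):
    """Return the first respondent whose outcome exceeds cutoff, else None."""
    found = None
    i = 0
    while i < len(respondents):
        if respondents[i]["outcome"] > cutoff:
            found = respondents[i]
            break  # stop scanning at the first match
        i += 1
    return found


def group_by_treatment(respondents):
    """Accumulate respondents into one list per treatment group."""
    groups = {}
    for r in respondents:
        if r["group"] not in groups:
            groups[r["group"]] = []
        groups[r["group"]].append(r)
    return groups


def treatment_summary(respondents):
    """Build a nested dict: group -> N, mean outcome, mean age."""
    summary = {}
    for group, members in group_by_treatment(respondents).items():
        n = len(members)  # always >= 1, so the divisions below are safe
        summary[group] = {
            "n": n,
            "mean_outcome": sum(m["outcome"] for m in members) / n,
            "mean_age": sum(m["age"] for m in members) / n,
        }
    return summary


# Hypothetical sample data: five respondents instead of 300.
sample = [
    {"id": "R001", "age": 30, "outcome": 4.0, "group": "control"},
    {"id": "R002", "age": 17, "outcome": 3.0, "group": "control"},
    {"id": "R003", "age": 40, "outcome": 6.0, "group": "control"},
    {"id": "R004", "age": 50, "outcome": 9.5, "group": "A"},
    {"id": "R005", "age": 20, "outcome": 5.5, "group": "A"},
]

eligible = filter_eligible(sample, 18)
outcomes = compute_outcome_per_respondent(eligible)
outlier = find_first_outlier(eligible, 9.0)
summary = treatment_summary(eligible)
print(summary)
```

With this sample, R002 is dropped by the age filter, so each group ends up with N = 2. The nested dict returned by `treatment_summary` is what stands in for the pivot table.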
7 lessons this week