Week 2 built up your aggregation vocabulary — grouping, sets, sliding windows. What felt surprisingly clean?
Set intersection. I was going to write a nested loop for shared IPs and then & made it one line. And the hash-table speedup is free.
That's the move: see the shape, reach for the primitive. Python gives you hash sets for a reason; you don't have to reinvent O(1) average-case lookup every time.
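To make the contrast concrete, here is a minimal sketch of both approaches. The function name shared_ips comes from this week's list; shared_ips_nested, the list-of-strings input format, and the sample addresses are assumptions made up for illustration.

```python
def shared_ips_nested(a, b):
    """Nested-loop baseline: compares every pair, so work grows with len(a) * len(b)."""
    shared = []
    for ip in a:
        for other in b:
            if ip == other and ip not in shared:
                shared.append(ip)
    return shared


def shared_ips(a, b):
    """Set intersection: each membership check is a hash lookup, O(1) on average."""
    return set(a) & set(b)


# Hypothetical sample data, not taken from the course logs.
logs_a = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
logs_b = ["10.0.0.3", "10.0.0.1", "10.0.0.9"]

print(shared_ips(logs_a, logs_b))
```

Both versions find the same addresses; the set version returns a set, so the result has no guaranteed order.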
The lambda in sorted still feels a little abstract. When to use a lambda versus a named function?
For a one-line key extractor, lambda is canonical — key=lambda x: x[1] is the right tool for "sort by the second tuple element." Named functions earn their keep when the logic spans multiple lines or needs to be reused elsewhere.
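One way to picture that split, as a rough sketch: the (message, count) tuples and the severity_key rule below are invented for illustration and are not part of the week's exercises.

```python
# Hypothetical (message, count) pairs.
pairs = [("timeout", 3), ("disk full", 7), ("auth failed", 5)]

# One-line key extractor: a lambda is enough to sort by the second element.
by_count = sorted(pairs, key=lambda x: x[1])


def severity_key(pair):
    """Multi-step key: auth errors first, then higher counts first."""
    message, count = pair
    is_auth = message.startswith("auth")
    # False sorts before True, so "not is_auth" puts auth errors at the front.
    return (not is_auth, -count)


ranked = sorted(pairs, key=severity_key)

print(by_count)
print(ranked)
```

Once the key needs a docstring or an explanatory comment, as severity_key does here, a named function is probably the better fit.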
This week you built five aggregation functions:
group_by_hour: dict grouping with setdefault(k, []).append(x)
count_unique_users: set comprehension + len() for cardinality
top_error_messages: frequency dict + sorted(.items(), key=lambda, reverse=True)[:n]
shared_ips: set intersection with &
sliding_window_max: generator with yield + range(len - size + 1)
Meta-skill: choosing the right aggregation primitive. Dict for grouping, set for uniqueness, generator for streaming analytics.
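The recap above can be sketched in code. This is one possible shape for four of the five functions, not the course's reference solutions: the entry format (dicts with "hour", "user", and "message" keys), the parameter names, and the sample data are all assumptions.

```python
def group_by_hour(entries):
    """Dict grouping: setdefault creates the list the first time an hour appears."""
    groups = {}
    for entry in entries:
        groups.setdefault(entry["hour"], []).append(entry)
    return groups


def count_unique_users(entries):
    """Set comprehension + len() gives the number of distinct users."""
    return len({entry["user"] for entry in entries})


def top_error_messages(entries, n):
    """Frequency dict, then sort (message, count) pairs by count, highest first."""
    counts = {}
    for entry in entries:
        counts[entry["message"]] = counts.get(entry["message"], 0) + 1
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]


def sliding_window_max(values, size):
    """Generator: yields the max of each window of `size` items (assumes size >= 1)."""
    for i in range(len(values) - size + 1):
        yield max(values[i:i + size])


# Hypothetical log entries for illustration only.
entries = [
    {"hour": 9, "user": "ana", "message": "timeout"},
    {"hour": 9, "user": "bo", "message": "disk full"},
    {"hour": 10, "user": "ana", "message": "timeout"},
]

print(group_by_hour(entries).keys())
print(count_unique_users(entries))
print(top_error_messages(entries, 1))
print(list(sliding_window_max([1, 3, 2, 5, 4], 3)))
```

Because sliding_window_max is a generator, nothing is computed until you iterate over it, which is what makes it suitable for streaming input. If the window is larger than the input, it simply yields nothing.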