Data at Scale
A look back at Week 3 — how lists and dictionaries turned one-at-a-time scripts into programs that handle real data.
Three weeks ago, you didn't know what def meant.
Two weeks ago, you wrote functions that took input, made decisions, and looped through data. Your programs could think — evaluate conditions, filter results, accumulate totals. But every piece of data lived inside your code. A list of five names. A single score. One customer at a time.
This week, that changed completely.
On Day 17, you went back to numbers and saw the type system for what it really is: not a technicality, but a contract between your data and your code. int and float aren't just labels. They determine what happens when you divide, when you round, when you compare. You learned that 10 / 2 gives you 5.0 (a float, not an integer), that int() truncates toward zero while round() goes to the nearest whole number (exact .5 ties go to the nearest even number, so round(2.5) is 2), and that None isn't zero. It's the absence of a value. Like an empty cell in your spreadsheet, not a cell containing 0.
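A quick sketch of those Day 17 rules, for reference. The variable names here are illustrative and not taken from the lesson.

```python
# Division with / always produces a float, even when the result is whole.
quotient = 10 / 2            # 5.0
floor_quotient = 10 // 2     # 5 (floor division of two ints stays an int)

# int() drops the fractional part (truncates toward zero).
truncated = int(3.9)             # 3
truncated_negative = int(-3.9)   # -3

# round() goes to the nearest integer; exact .5 ties go to the even neighbour.
rounded = round(3.9)   # 4
tie_low = round(2.5)   # 2
tie_high = round(3.5)  # 4

# None means "no value", which is different from the number 0.
missing_score = None
print(missing_score is None)  # True
print(missing_score == 0)     # False
```

The tie behaviour only shows up on exact halves; for everyday values like 3.9 or 2.2, round() does what you would expect.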
On Day 18, you discovered lists, and the scale problem vanished. Instead of customer1, customer2, customer3 stretching into absurdity, you had one name and any number of items: customers = ["Alice", "Bob", "Charlie", ...]. Indexing gave you direct access to any position. Negative indices gave you the last item without knowing the length. Slicing gave you subsets without loops. And it all worked for 3 items or 3 million. That moment when you realized customers[-1] gives you the last item of any non-empty list, which is the most recent entry when you add customers in order, regardless of list size: that was the shift from thinking in fixed cells to thinking in dynamic collections.
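Here is a small sketch of indexing and slicing on a list like the one above. The extra names ("Dana", "Eli") and the empty-list guard are assumptions added for illustration.

```python
# A sample list; the last two names are made up for this example.
customers = ["Alice", "Bob", "Charlie", "Dana", "Eli"]

first = customers[0]         # "Alice"  (positions start at 0)
last = customers[-1]         # "Eli"    (last item, whatever the length)
first_three = customers[:3]  # ["Alice", "Bob", "Charlie"]
last_two = customers[-2:]    # ["Dana", "Eli"]

# Indexing an empty list raises IndexError, so guard when the list may be empty.
no_customers = []
latest = no_customers[-1] if no_customers else None  # None

print(first, last, first_three, last_two, latest)
```

Slices never raise on short lists; they simply return fewer items, which is why only plain indexing needs the guard.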
On Day 19, your lists learned to change. .append() added customers as they signed up. .remove() took them off when they cancelled. .sort() put sales figures in order for your report. And then list comprehensions collapsed four lines of loop-and-append into one clean expression: [s for s in scores if s >= 60]. The same filtering you'd do with a spreadsheet autofilter, but programmable, repeatable, and composable. You went from the accumulator pattern of Week 2 to a one-liner that does the same work.
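As a reminder of how those pieces fit together, here is a minimal sketch with made-up sample data. It shows the three list methods and then the loop-and-append version next to the equivalent comprehension.

```python
# Sample data, invented for this example.
customers = ["Alice", "Bob"]
customers.append("Charlie")  # add to the end
customers.remove("Bob")      # removes the first "Bob"; ValueError if absent

sales = [250, 90, 410]
sales.sort()                 # sorts in place, ascending

scores = [45, 72, 60, 88, 59]

# Week 2 style: accumulate matches with a loop.
passing_loop = []
for s in scores:
    if s >= 60:
        passing_loop.append(s)

# Week 3 style: the same filter as a list comprehension.
passing = [s for s in scores if s >= 60]

print(customers, sales, passing)
```

Both versions build a new list and leave scores unchanged, so you can filter the same data several ways without losing the original.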
On Day 20, you stopped looking up data by position and started looking it up by name. customer["email"] instead of customer[1]. That's not just convenience — it's clarity. A dictionary is a lookup table, and it made your code read like English. You learned that .get() is the safe way to access data that might not exist — because in the real world, data is always messier than you expect. And you learned that the same syntax creates and updates keys, because Python doesn't care whether you're filling in a blank or correcting a mistake.
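A short sketch of those dictionary basics. The "phone" key and all the values are hypothetical, chosen only to show a missing key being handled.

```python
# A sample record; the values are invented for this example.
customer = {"name": "Alice", "email": "alice@example.com"}

email = customer["email"]    # lookup by name, not by position

# customer["phone"] would raise KeyError here, because the key is missing.
phone = customer.get("phone")                       # None
phone_text = customer.get("phone", "not provided")  # fallback value

# The same assignment syntax creates a key or updates an existing one.
customer["phone"] = "555-0100"  # creates the key
customer["phone"] = "555-0199"  # updates it

print(email, phone, phone_text, customer["phone"])
```

Use square brackets when a missing key is a bug you want to hear about, and .get() when missing data is normal.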
On Day 21, you looped through dictionaries — and that's when the real power showed up. The .items() loop let you walk through key-value pairs like reading rows in a spreadsheet. The counting pattern — counts[word] = counts.get(word, 0) + 1 — turned a list of raw data into a frequency table in three lines. Nested dictionaries let you model structured records: a customer who has a name, an email, and a purchase total, all in one object. You built a mini-database without a database.
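The counting pattern and a nested-dictionary loop might look like the sketch below. The word list, the customer records, and the 100 threshold are all made up for illustration.

```python
# Counting pattern: build a frequency table from raw data.
words = ["apple", "pear", "apple", "fig", "pear", "apple"]
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1  # start at 0 the first time

# Nested dictionaries: one record per customer, keyed by a short id.
customers = {
    "alice": {"name": "Alice", "email": "alice@example.com", "total": 120.50},
    "bob": {"name": "Bob", "email": "bob@example.com", "total": 35.00},
}

# .items() yields each key together with its value (here, the inner record).
for customer_id, record in customers.items():
    print(customer_id, record["email"], record["total"])

# Combine with a comprehension: names of customers who spent over 100.
big_spenders = [r["name"] for r in customers.values() if r["total"] > 100]
print(counts, big_spenders)
```

Dictionaries keep insertion order in Python 3.7 and later, so both loops visit entries in the order they were added.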
Here's the arc of this week, at a level above the syntax: you went from one-at-a-time to collections. From fixed to flexible. From "I'll type the data into my code" to "my code handles whatever data shows up." A list holds any number of items. A dictionary gives any item a meaningful name. Together, they're how most real Python programs represent data, from a simple script processing your team's expense reports to a web application serving millions of users.
But here's the thing you've probably started wondering about. All the data you've worked with still lives inside your Python code. You typed it. You defined it. What about the spreadsheets on your desktop? The CSV files your team emails around? The data that lives in files?
That's Week 4. Your data is about to come from the real world.