Day 3 · ~15m

Python Data Model: __len__, __getitem__, and How Python Talks to Your Objects

Dunder methods are Python's backstage machinery. Learn how __len__, __getitem__, __iter__, and __contains__ let your objects respond to len(), [], for, and in.

student (thinking)

I just got assigned to the order processing pipeline — Amir handed it off before he left for that conference. There's a class called OrderBatch and it doesn't work with len(). My first instinct was to add a .count() method, but Amir left a comment: "make it work with len()". I don't even know what that means.

teacher (serious)

It means Amir wants the built-in len() function to work on your object directly. len(batch) instead of batch.count(). Those two look similar but they're completely different things.

student (curious)

What's the difference? If they both return the count, why does it matter which one I use?

teacher (neutral)

Because len() is part of Python's data model. When you call len(batch), Python doesn't look for a method named len on your object. It looks for __len__. That double-underscore method is how Python's built-in machinery talks to your objects. Every built-in operation routes through a specific dunder method — and len() is just the public entrance. __len__ is the stage machinery behind the wall.

student (surprised)

So when I've been calling len(my_list) for eighteen months, Python has been calling list.__len__(my_list) the whole time?

teacher (amused)

Every single time. You've been using the theatre's front door without ever wondering what's backstage. Try it yourself:

orders = ["ORD-001", "ORD-002", "ORD-003"]
print(len(orders))           # 3
print(orders.__len__())      # 3  — same thing

Those two calls are identical. len(orders) is syntactic sugar that dispatches to orders.__len__(). Python's interpreter is doing that translation for you invisibly.

student (focused)

Okay. So if I define __len__ on my OrderBatch class, then len(batch) will call it?

teacher (encouraging)

Exactly. And it's the same pattern for everything else that feels "built-in". Square bracket access — batch[0] — routes through __getitem__. The for loop asks for __iter__. The in operator uses __contains__. These aren't magic. They're a documented protocol. Once you know the protocol, you can make any object behave like a native Python sequence.

Here's the minimal version:

class OrderBatch:
    def __init__(self, orders):
        self._orders = orders

    def __len__(self):
        return len(self._orders)

batch = OrderBatch([
    {"id": "ORD-001", "customer": "Priya", "total": 89.99, "status": "pending"},
    {"id": "ORD-002", "customer": "Amir",  "total": 124.50, "status": "shipped"},
])

print(len(batch))  # 2

Notice: __len__ delegates to the underlying list's len(). You're not reimplementing counting — you're exposing the count through the protocol.

student (confused)

But if I try batch[0] it'll still fail, right? I need a separate method for that?

teacher (focused)

Right. __getitem__ is separate. Each operator has its own dunder. batch[0] raises a TypeError until you implement __getitem__:

class OrderBatch:
    def __init__(self, orders):
        self._orders = orders

    def __len__(self):
        return len(self._orders)

    def __getitem__(self, index):
        return self._orders[index]

batch = OrderBatch([
    {"id": "ORD-001", "customer": "Priya", "total": 89.99, "status": "pending"},
    {"id": "ORD-002", "customer": "Amir",  "total": 124.50, "status": "shipped"},
])

print(batch[0]["id"])   # ORD-001
print(batch[-1]["id"])  # ORD-002 — negative indexing works for free

And here's something that surprises most people: once you have __getitem__, you get for loops for free. Python will call __getitem__ with indices 0, 1, 2... until it hits an IndexError. No __iter__ needed — yet.

student (excited)

Wait. So for order in batch: already works after I implement __getitem__? Without writing __iter__?

teacher (neutral)

It does — via Python's fallback iteration protocol. If __iter__ is absent, Python falls back to calling __getitem__ with sequential integers. But there's a reason to implement __iter__ explicitly anyway: it's faster, it handles non-integer keys, and it signals intent. Here's the explicit version:

class OrderBatch:
    def __init__(self, orders):
        self._orders = orders

    def __len__(self):
        return len(self._orders)

    def __getitem__(self, index):
        return self._orders[index]

    def __iter__(self):
        return iter(self._orders)

    def __contains__(self, order_id):
        return any(o["id"] == order_id for o in self._orders)

batch = OrderBatch([
    {"id": "ORD-001", "customer": "Priya", "total": 89.99, "status": "pending"},
    {"id": "ORD-002", "customer": "Amir",  "total": 124.50, "status": "shipped"},
])

for order in batch:
    print(order["id"])   # ORD-001, then ORD-002

print("ORD-001" in batch)  # True
print("ORD-999" in batch)  # False
student (curious)

The __contains__ is checking by order ID, not by the dict object itself. Is that the right call?

teacher (serious)

That's the design decision — and it's exactly right for a domain object. For a plain list, in checks object equality. For an OrderBatch, it makes far more sense to check whether an order ID exists in the batch. The data model gives you the hook; you decide what the semantics are. That's the whole point of implementing dunders yourself rather than just using a raw list.

student (proud)

This is what I've been missing. I've been writing classes that have methods like get_count() and find_by_id() when I could have been making them speak Python's native language. The codebase has so many of these.

teacher (encouraging)

That's the shift. Once you implement the protocol, your objects work with len(), list comprehensions, sorted(), zip(), enumerate() — all of it. Not because you wrote special cases for each one, but because Python's built-in machinery all speaks the same protocol. Your next step is understanding how deep that machinery goes — and tomorrow we'll look at descriptors, which are how Python handles attribute access itself. You think batch.orders is simple. It isn't.