I ran cProfile on your inventory export last night. Sorted by cumulative time. The top entry was compute_restock_threshold. 41,000 calls. 700 microseconds each.
That function is three lines. I assumed it was basically free. I have been optimizing the database loop for two weeks.
The database loop takes 0.8 seconds. compute_restock_threshold takes 29 seconds. You have been making the fast thing faster. The slow thing you never touched because you never measured it.
Same inputs produce the same output every time — it is a pure function. So every time I call it with the same SKU and the same lead-time parameters, I am recomputing a result I already have.
For your top twenty SKUs, you compute the same threshold between 800 and 2,000 times each. The fix is one line:
from functools import lru_cache
@lru_cache(maxsize=128)
def compute_restock_threshold(sku: str, lead_time: int, demand_rate: float) -> float:
    # expensive calculation — now runs once per unique input set
    ...
The decorator stores results keyed by the argument tuple. First call computes. Every subsequent call with the same arguments returns the stored result. maxsize=128 means at most 128 distinct input combinations are held. LRU eviction drops the least-recently-used entry when the cache is full.
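The effect can be checked with a toy stand-in. Here the function body is a made-up formula, and a module-level counter (both hypothetical, for illustration only) tracks how often the body actually executes:

```python
from functools import lru_cache

calls = 0  # counts real executions of the body, not cached returns

@lru_cache(maxsize=128)
def compute_restock_threshold(sku: str, lead_time: int, demand_rate: float) -> float:
    global calls
    calls += 1
    return demand_rate * lead_time * 1.5  # stand-in for the real calculation

for _ in range(1000):
    compute_restock_threshold("SKU-001", 7, 3.2)

print(calls)  # 1: the body ran once; the other 999 calls hit the cache
```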
How do I verify it is actually working? I want to see the cache hits, not assume the decorator is doing something.
lru_cache attaches cache_info() to the decorated function:
print(compute_restock_threshold.cache_info())
# CacheInfo(hits=40953, misses=47, maxsize=128, currsize=47)
40,953 hits, 47 misses — 99.9% of calls returned cached results. cache_clear() resets it between test runs. You are running a measurement on your cache, the same way cProfile ran a measurement on your function.
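A minimal runnable version of that measurement, with a hypothetical threshold function standing in for the real one:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def threshold(sku: str) -> float:
    return len(sku) * 1.0  # stand-in computation

threshold("SKU-001")   # miss: computed and stored
threshold("SKU-001")   # hit: returned from cache
threshold("SKU-002")   # miss
print(threshold.cache_info())   # CacheInfo(hits=1, misses=2, maxsize=128, currsize=2)

threshold.cache_clear()         # reset between test runs
print(threshold.cache_info())   # CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)
```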
What are the constraints? I have another function that takes a dict of parameters. Would that break?
Immediately — TypeError: unhashable type: 'dict'. The cache keys the arguments as a tuple, and tuple elements must be hashable. For dict arguments, flatten to a sorted tuple of items before the cache boundary:
@lru_cache(maxsize=128)
def _compute_cached(param_tuple: tuple) -> float:
    params = dict(param_tuple)
    return _actual_computation(params)

def compute_with_dict(params: dict) -> float:
    return _compute_cached(tuple(sorted(params.items())))
The public function converts the unhashable dict to a hashable tuple. The private cached function does the work. The cache never sees the dict.
I also have a shipping rate function — eleven elif branches, one for each zone. Different problem, but I wonder if the same dict-as-lookup pattern applies.
Exactly the same instinct. That elif chain evaluates conditions in order — four comparisons for branch four, eleven for branch eleven. O(n) in the number of branches. The dict version:
_SHIPPING_RATES = {
    "domestic": 4.99,
    "canada": 12.99,
    "europe": 24.99,
    "asia": 31.99,
}
def get_shipping_rate(zone: str) -> float:
    return _SHIPPING_RATES.get(zone, 49.99)
One hash lookup, constant time regardless of zone count. And when you add zone twelve, you add one line to the dict, not one more elif at the bottom.
The elif chain buries data inside control flow. The dict makes it a table — I can load it from config, override it in tests, inspect it without running the function.
That is the better argument, actually. The performance win is real but small for eleven branches. The structural win is permanent — data separated from logic is testable and changeable independently. When the branches are callables instead of values, it is a dispatch table:
_HANDLERS = {
    "domestic": handle_domestic,
    "canada": validate_export,
    "europe": call_eu_service,
}
def process_order(order, zone: str):
    handler = _HANDLERS.get(zone)
    if handler is None:
        raise ValueError(f"Unknown zone: {zone}")
    return handler(order)
Adding a zone is one dict entry and one function. No elif ordering to maintain.
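A self-contained sketch of the same dispatch pattern; the handler bodies here are invented placeholders:

```python
def handle_domestic(order):
    return f"shipped:{order}"

def validate_export(order):
    return f"export-checked:{order}"

_HANDLERS = {
    "domestic": handle_domestic,
    "canada": validate_export,
}

def process_order(order, zone: str):
    handler = _HANDLERS.get(zone)
    if handler is None:
        raise ValueError(f"Unknown zone: {zone}")
    return handler(order)

print(process_order("ORD-1", "domestic"))  # shipped:ORD-1

# Extending is one dict entry, with no elif chain to reorder:
_HANDLERS["mexico"] = lambda order: f"mx:{order}"
print(process_order("ORD-2", "mexico"))  # mx:ORD-2
```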
I have eleven elif branches and seven more in the order status formatter. Two dispatch tables waiting to be written.
And three changes total — the lru_cache on compute_restock_threshold, the shipping rate dict, the status formatter dict. Maybe forty lines. The thirty-second script becomes two or three seconds.
Without the cProfile run, I would have kept optimizing the 0.8-second database loop.
You would have made it perfect and your script would still take 29 seconds. This is why measuring before optimizing is not a best practice. It is the prerequisite. You were thorough. You were thorough in the wrong place. cProfile tells you where to be thorough.
cache_info() after the fix. Dispatch tables instead of elif chains. And cProfile to find the next thing I am being thorough about in the wrong place.
That sentence took you four weeks to be able to say. Come back when you have run the fix and read the cache info output.
How lru_cache Works: Cache Keys, LRU Eviction, and the Dispatch Table Pattern
Cache key construction. lru_cache wraps the decorated function in a wrapper (C-accelerated in CPython) that maintains an internal dictionary. When the wrapper is called, it constructs a cache key from the positional arguments tuple; if keyword arguments are present, a sentinel marker is appended, followed by the flattened key-value items. This construction happens in _make_key() in functools.py. Note that keyword argument order matters: on modern CPython, f(a=1, b=2) and f(b=2, a=1) produce different keys and get separate cache entries. The key is then looked up in the cache dict using normal Python dict hashing — the O(1) average-case property of dict lookup is what makes cache hits fast.
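One consequence worth seeing directly: because the key preserves keyword order, the same logical call spelled different ways caches more than once (behavior of current CPython; assumed here):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(a, b):
    return a + b

f(a=1, b=2)
f(b=2, a=1)  # same logical call, different keyword order
print(f.cache_info().misses)  # 2: two separate cache entries

f(1, 2)      # positional spelling is yet another key
print(f.cache_info().misses)  # 3
```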
LRU eviction mechanics. The bounded cache pairs a dict with a circular doubly linked list: the dict maps each key to a link node, and each link carries prev and next pointers. On a cache hit, the link is moved to the most-recently-used end of the list. When the cache is full and a new entry arrives, the link at the least-recently-used end is evicted. This gives O(1) insertion, O(1) lookup, and O(1) eviction — the full LRU contract without ever scanning the entire cache. maxsize=None disables the linked-list maintenance entirely and uses a plain dict, which is why lru_cache(maxsize=None) is slightly faster per call than a bounded cache.
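The eviction order can be observed from the outside with a tiny bounded cache:

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def square(n):
    return n * n

square(1)   # miss; cache holds [1]
square(2)   # miss; cache holds [2, 1], 2 most recent
square(1)   # hit; 1 moves to most-recently-used
square(3)   # miss; cache full, evicts 2, the least recently used
print(square.cache_info())  # CacheInfo(hits=1, misses=3, maxsize=2, currsize=2)

square(2)   # miss again: 2 was evicted and must be recomputed
```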
When caching is incorrect. lru_cache is only correct for pure functions — same inputs always produce the same output, no observable side effects. Caching a function that reads from a database, calls a network service, depends on datetime.now(), or modifies shared state will produce stale results silently. The cache has no expiry by default; once a result is stored, it is returned forever until cache_clear() is called or the program exits. For functions with time-based or state-dependent outputs, use cachetools.TTLCache (external library) or manage your own dict cache with explicit invalidation.
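A small demonstration of the failure mode, using a deliberately impure, hypothetical function whose result depends on the clock:

```python
from functools import lru_cache
import time

@lru_cache(maxsize=None)
def quote_price(sku: str) -> float:
    # impure: the result depends on when it runs, not only on sku
    return time.time()

first = quote_price("SKU-001")
time.sleep(0.01)
second = quote_price("SKU-001")
print(first == second)  # True: the cache silently returns the stale first result
```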
The dispatch table as a design pattern. The dict-of-callables pattern — mapping string keys to functions — is the Python equivalent of a strategy pattern or a jump table. It has three advantages over elif chains. First, O(1) dispatch regardless of the number of cases, because dict lookup is hash-based. Second, the table is a first-class data structure: it can be extended at runtime, serialized, inspected, and overridden in tests by swapping entries. Third, it enforces a uniform interface — all handlers must accept the same arguments, which becomes an implicit contract. The elif version permits each branch to have a different shape, making refactoring harder. The dispatch table makes the contract explicit.