If the same project name lands in your inbox, on your calendar, and in your task list, a naive combined list repeats it three times. How do you keep first-seen order and drop the rest?
I can think of a for-loop with a seen set, but is there a one-liner?
list(dict.fromkeys(combined)) is the one-liner. Dict keys are unique and ordered in Python 3.7+, so this preserves first-seen order while dropping duplicates:
unique = list(dict.fromkeys(combined))
For real dedup pipelines you want something more explicit: a set tracking what you have emitted and a list keeping the order, right?
Yes. The set version reads more clearly and extends easily to fuzzy matching later. Here's the function with the explicit pattern:
def dedup_across_sources(max_items: int) -> list:
    # `toolset` and `Action` are assumed to be set up earlier in the lesson environment.
    emails = toolset.execute_action(Action.GMAIL_FETCH_EMAILS, {"max_results": max_items})
    events = toolset.execute_action(Action.GOOGLECALENDAR_FIND_EVENT, {"query": ""})
    tasks = toolset.execute_action(Action.GOOGLETASKS_LIST_TASKS, {"max_results": max_items})
    # Pull one display string per item from each source into a single list.
    combined = (
        [m.get("snippet", "") for m in emails.get("messages", [])]
        + [e.get("summary", "") for e in events.get("items", [])]
        + [t.get("title", "") for t in tasks.get("items", [])]
    )
    seen = set()   # fast "have I emitted this already?" check
    unique = []    # keeps first-seen order
    for item in combined:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    print(f"Deduped {len(combined)} items down to {len(unique)}")
    return unique
Sets have O(1) lookup, so checking item not in seen scales even when the combined list has thousands of entries?
Exactly. The set is the speed; the list is the order. That combination — fast membership plus preserved order — is the production shape of dedup at any scale.
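To try the pattern without touching live accounts, here is a minimal sketch that pulls the loop into a standalone helper and runs it on made-up data. The helper name `dedup_preserving_order` and the three sample lists are assumptions for illustration, not part of the lesson's function.

```python
from typing import Hashable, Iterable, List


def dedup_preserving_order(items: Iterable[Hashable]) -> List[Hashable]:
    """Return items with duplicates removed, keeping first-seen order."""
    seen = set()   # membership check
    unique = []    # output order
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


# Hypothetical sample data standing in for the three live sources.
email_snippets = ["Project Atlas", "Lunch plans"]
event_summaries = ["Project Atlas", "Dentist"]
task_titles = ["Project Atlas", "Lunch plans"]

combined = email_snippets + event_summaries + task_titles
unique = dedup_preserving_order(combined)
print(f"Deduped {len(combined)} items down to {len(unique)}")
print(unique)
```

Note that the items have to be hashable (strings are) for the set check to work.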
So the printed number tells me how much duplication actually exists across my live inbox, calendar, and tasks?
Real data, real dedup count. Sets are the right tool whenever you need to ask 'have I seen this already?' in constant time.
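The earlier remark that the explicit loop extends toward fuzzy matching can be illustrated with one small step in that direction: compare a normalized key instead of the raw string. This is a sketch under assumptions. The helper `dedup_by_key` and the normalization rule (collapse whitespace, ignore case) are hypothetical choices, not something the lesson prescribes.

```python
def normalize(text: str) -> str:
    """Collapse runs of whitespace and ignore case (an assumed rule)."""
    return " ".join(text.split()).casefold()


def dedup_by_key(items, key=normalize):
    """Keep the first item for each distinct key, in first-seen order."""
    seen = set()
    unique = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)          # remember the normalized form
            unique.append(item)  # but emit the original text
    return unique


# Hypothetical titles that differ only in case and spacing.
titles = ["Project Atlas", "project atlas ", "PROJECT  ATLAS", "Budget review"]
result = dedup_by_key(titles)
print(result)
```

The set still answers "have I seen this already?", only now the question is asked about the key rather than the raw item.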
TL;DR: A set for membership, a list for order. The two together dedup any list in one pass.
- seen set: O(1) membership test, even for thousands of items
- unique list: preserves first-seen order

dict.fromkeys

| Approach | When to use |
|---|---|
| list(dict.fromkeys(items)) | one-liner, no extra condition |
| seen = set(); unique = [] loop | when you add filter / transform logic later |
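A short side-by-side may make the table concrete. The sketch below uses made-up data and assumes one extra condition, skipping empty strings, which is the kind of filter the loop version can absorb and the one-liner cannot. Empty strings are a plausible case here because the lesson's function defaults missing fields to "".

```python
# Hypothetical combined list, including empty strings from missing fields.
items = ["Project Atlas", "", "Lunch plans", "Project Atlas", "", "Dentist"]

# One-liner: drops duplicates, but has no place for an extra condition,
# so a single empty string survives.
one_liner = list(dict.fromkeys(items))

# Explicit loop: same dedup, plus a filter step.
seen = set()
filtered = []
for item in items:
    if not item:  # extra condition: skip empty strings
        continue
    if item not in seen:
        seen.add(item)
        filtered.append(item)

print(one_liner)
print(filtered)
```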