You want every IP address in a block of log text. Could be one, could be fifty. re.search tells you yes or no. What tool gives you all the matches?
Something that returns a list. So I can loop over the matches and collect them somewhere.
re.findall(pattern, text) returns a list of every non-overlapping match. For IPs, the pattern has four groups of 1–3 digits separated by literal dots:
import re
pattern = r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"
matches = re.findall(pattern, text)\d{1,3} means "one to three digits." \. is a literal dot — the backslash escapes it because a bare . would match any character.
Why the \b at the start and end?
Word boundaries. Without them, a longer string like 1234.5.6.7.8 could produce partial matches at weird offsets. \b locks the match to start and end at a transition between a word character and a non-word character. Every IP match is anchored.
findall gives me raw matches though — duplicates and all. How do I get unique IPs in sorted order?
Three tools, one line:
unique_sorted = sorted(set(matches))set(...) drops duplicates. sorted(...) gives a stable, deterministic order. The pipeline findall → set → sorted is the standard recipe whenever you extract repeated values from text.
So the whole function is: compile the pattern, findall, set, sorted. The shape works for emails, file paths, error codes — anything with a pattern.
Swap the pattern and the recipe carries. Learn the shape once, apply it everywhere.
re.findall + dedup recipeTL;DR: sorted(set(re.findall(pattern, text))) gives unique matches in stable order.
re.findall — list of every non-overlapping matchset(list) — drop duplicates (O(n) hash lookup)sorted(set) — deterministic ordering| Piece | Meaning |
|---|---|
\b | word boundary |
\d{1,3} | 1–3 digits |
\. | literal dot |
\b...\b | locks both ends |
The pattern matches the shape of an IP, not its validity — 999.999.999.999 would match.
You want every IP address in a block of log text. Could be one, could be fifty. re.search tells you yes or no. What tool gives you all the matches?
Something that returns a list. So I can loop over the matches and collect them somewhere.
re.findall(pattern, text) returns a list of every non-overlapping match. For IPs, the pattern has four groups of 1–3 digits separated by literal dots:
import re
pattern = r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"
matches = re.findall(pattern, text)\d{1,3} means "one to three digits." \. is a literal dot — the backslash escapes it because a bare . would match any character.
Why the \b at the start and end?
Word boundaries. Without them, a longer string like 1234.5.6.7.8 could produce partial matches at weird offsets. \b locks the match to start and end at a transition between a word character and a non-word character. Every IP match is anchored.
findall gives me raw matches though — duplicates and all. How do I get unique IPs in sorted order?
Three tools, one line:
unique_sorted = sorted(set(matches))set(...) drops duplicates. sorted(...) gives a stable, deterministic order. The pipeline findall → set → sorted is the standard recipe whenever you extract repeated values from text.
So the whole function is: compile the pattern, findall, set, sorted. The shape works for emails, file paths, error codes — anything with a pattern.
Swap the pattern and the recipe carries. Learn the shape once, apply it everywhere.
re.findall + dedup recipeTL;DR: sorted(set(re.findall(pattern, text))) gives unique matches in stable order.
re.findall — list of every non-overlapping matchset(list) — drop duplicates (O(n) hash lookup)sorted(set) — deterministic ordering| Piece | Meaning |
|---|---|
\b | word boundary |
\d{1,3} | 1–3 digits |
\. | literal dot |
\b...\b | locks both ends |
The pattern matches the shape of an IP, not its validity — 999.999.999.999 would match.
Write `extract_ip_addresses(text)` that returns a sorted list of unique IPv4 addresses found anywhere in the input string. Use `re.findall` with the pattern `r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"`, then dedupe and sort.
Tap each step for scaffolded hints.
No blank-editor panic.