The ops team sent over a batch of log file references — path strings from their monitoring agent. Before I show you pathlib, tell me how you'd currently get the filename without its extension from /var/log/services/auth/2026-03-31.log.
I know this one. os.path.splitext(os.path.basename(path_string))[0]. I have that chain memorized. I hate typing it every time.
You hate it because you're right to hate it. Here's what you've been waiting for since Track 1:
from pathlib import Path
p = Path("/var/log/services/auth/2026-03-31.log")
print(p.name) # 2026-03-31.log
print(p.stem) # 2026-03-31
print(p.suffix) # .log
print(p.parent) # /var/log/services/auth
.stem gives me the filename without the extension. No split, no slice, no os.path anything. I've been writing os.path.splitext(os.path.basename(path_string))[0] for two years.
Two years. The analogy I use: pathlib is GPS navigation for your filesystem. os.path is a paper map with manual turn counting. pathlib knows the map — you say "what's the stem?" and it handles everything. And notice: creating a Path from a string doesn't touch the filesystem. The file doesn't need to exist. It's pure path-string analysis.
Which is exactly what I need. The ops team's monitoring agent sends me path strings from remote servers I can't reach from my machine. I just need to analyze the structure — extensions, parent directories, stems. Not open anything.
Perfect use case. And for compressed logs with compound extensions:
from pathlib import Path
p = Path("/var/log/nginx/access.log.gz")
print(p.suffix) # .gz
print(p.suffixes) # ['.log', '.gz']
print(p.stem) # access.log
.suffix gives the last extension. .suffixes gives all of them as a list. .stem strips only the final one, so access.log retains its date or type context.
I'm already seeing how this fits the problem. The ops team sends a mixed batch — .log, .log.gz, .json, .csv. I want to group them by extension. With pathlib I just create a Path from each string and call .suffix. No edge cases for files with dots in the name.
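One caveat when grouping a mixed batch: .suffix reports only the final extension, so access.log.gz would land under .gz, and .suffixes splits on every dot in the name, so a file such as 2026.03.31.log reports ['.03', '.31', '.log']. A minimal sketch of one way to group compressed logs under their compound extension follows. The helper name grouping_extension, the COMPRESSION_SUFFIXES set, and the /var/log/app path are assumptions made for illustration, not part of the lesson.

```python
from pathlib import Path

# Hypothetical set of suffixes treated as "compression wrappers".
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".xz"}


def grouping_extension(path_string: str) -> str:
    """Return the extension to group by, keeping compound ones like .log.gz."""
    p = Path(path_string)
    suffixes = p.suffixes
    # Only join the last two suffixes when the final one is a compression suffix.
    if len(suffixes) >= 2 and suffixes[-1] in COMPRESSION_SUFFIXES:
        return "".join(suffixes[-2:])
    return p.suffix


print(grouping_extension("/var/log/nginx/access.log.gz"))           # .log.gz
print(grouping_extension("/var/log/services/auth/2026-03-31.log"))  # .log
print(Path("/var/log/app/2026.03.31.log").suffixes)                 # ['.03', '.31', '.log']
```

A name that combines dotted dates with a compression suffix would still need its own rule, so treat this as a starting point rather than a complete classifier.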
And the parent path is a Path object too — call str() or .as_posix() when you need a plain string for dict keys:
from pathlib import Path
paths = [
    "/var/log/services/auth/2026-03-31.log",
    "/var/log/services/api/2026-03-30.json",
]
for s in paths:
    p = Path(s)
    print(p.suffix, str(p.parent), p.stem)
Three attributes, one loop, and I have everything I need to build the summary dict. Unique extensions from .suffix, unique parent directories from str(p.parent), stems from .stem. Everything I learned in Track 2 about dict accumulation applies directly.
The pathlib attributes are new. The data structures are yours. That combination — stdlib module handles the format, your Python handles the logic — is the whole game.
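The summary dict described in this exchange could be sketched roughly as follows. The function name summarize_paths and the shape of the returned dict are assumptions chosen for illustration; the real summary format would depend on what the ops team needs.

```python
from collections import defaultdict
from pathlib import Path


def summarize_paths(path_strings):
    """Group stems by final extension and collect unique parent directories."""
    stems_by_extension = defaultdict(list)
    parents = set()
    for s in path_strings:
        p = Path(s)  # pure string analysis; nothing is opened
        stems_by_extension[p.suffix].append(p.stem)
        parents.add(p.parent.as_posix())  # plain string, forward slashes
    return {
        "stems_by_extension": dict(stems_by_extension),
        "parents": sorted(parents),
    }


batch = [
    "/var/log/services/auth/2026-03-31.log",
    "/var/log/services/api/2026-03-30.json",
    "/var/log/nginx/access.log.gz",
]
summary = summarize_paths(batch)
print(summary["stems_by_extension"])
print(summary["parents"])
```

The pathlib calls are limited to .suffix, .stem, and .parent; the accumulation itself is ordinary dict and set work.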
I spent two years doing os.path.splitext(os.path.basename(x))[0] when .stem was sitting right here. I'm not angry. I'm at peace. I have found the Path.
Tomorrow you'll meet os and sys — pathlib's older, less friendly cousins. Environment variables, command-line arguments, process management. If pathlib is GPS navigation, os and sys are the building's control panel — thermostats, intercoms, emergency stops. Less elegant, but unavoidable when your scripts interact with the world outside Python.
pathlib was added in Python 3.4 to replace the fragmented, string-based path manipulation that had accumulated across os, os.path, and user code. The key insight behind its design: a file path is not a string. It is a structured object with a location, a name, a stem, a suffix, and relationships to other paths. Treating it as a plain string means manually re-parsing that structure every time you need any of it.
Creating a Path object does not touch the filesystem. Path("/var/log/auth.log") on a machine that has no /var/log/ directory raises no error — it simply represents the path as an object. Filesystem access only happens when you call methods like .exists(), .stat(), .iterdir(), or .open(). This makes pathlib ideal for analyzing path structures in data — log metadata, monitoring exports, file inventories — without needing access to the actual files.
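A small sketch of that distinction, assuming the path below does not exist on the machine running it (the directory and filename are made up for illustration):

```python
from pathlib import Path

# Hypothetical path; nothing is checked on disk when the object is created.
missing = Path("/no/such/directory/agent-export.log")

# Attribute access is pure string analysis.
stem = missing.stem
suffix = missing.suffix
print(stem, suffix)

# Only a method like .exists() actually asks the filesystem.
found = missing.exists()
print(found)
```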
The most-used attributes: .name (full filename with extension), .stem (filename without final extension), .suffix (final extension including the dot), .suffixes (list of all extensions), .parent (the containing directory as a Path), .parts (the path split into a tuple of components). All of these work on absolute and relative paths alike. A plain Path follows the conventions of the machine it runs on, so to parse Windows-style paths on a non-Windows machine (or the reverse), construct a PureWindowsPath or PurePosixPath explicitly.
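As a rough illustration of .parts and of parsing another platform's paths, the sketch below uses the pure classes so the result does not depend on the machine running it. The Windows path is a made-up example.

```python
from pathlib import PurePosixPath, PureWindowsPath

posix = PurePosixPath("/var/log/services/auth/2026-03-31.log")
print(posix.parts)  # ('/', 'var', 'log', 'services', 'auth', '2026-03-31.log')

# Hypothetical Windows-style path, parsed with Windows rules on any OS.
win = PureWindowsPath(r"C:\Logs\services\auth\2026-03-31.log")
print(win.name)    # 2026-03-31.log
print(win.stem)    # 2026-03-31
print(win.parent)  # C:\Logs\services\auth

# Relative paths expose the same attributes.
rel = PurePosixPath("services/auth/2026-03-31.log")
print(rel.parent, rel.is_absolute())  # services/auth False
```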
pathlib overloads the / operator for path joining: Path("/var/log") / "services" / "auth.log" produces Path("/var/log/services/auth.log"). This replaces os.path.join(), which is both more verbose and less readable. At least one operand must be a Path object: "var" / Path("log") works because Path also implements the reflected operator, but "var" / "log" raises TypeError since neither side is a Path.
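A short sketch of joining with /, using PurePosixPath so the printed result is the same on any platform. The variable names are illustrative only.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/var/log")
joined = base / "services" / "auth.log"
print(joined)  # /var/log/services/auth.log

# An absolute right-hand operand replaces everything before it,
# the same way os.path.join behaves.
replaced = base / "/etc/hosts"
print(replaced)  # /etc/hosts

# With two plain strings there is no Path involved, so / is undefined.
left, right = "/var/log", "services"
try:
    left / right
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```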
pathlib has two class hierarchies. PurePath (and its subclasses PurePosixPath, PureWindowsPath) provides all the string-manipulation attributes with zero filesystem access. Path (and its subclasses PosixPath, WindowsPath) inherits from PurePath and adds filesystem methods. For log metadata analysis where you never open files, PurePath is technically sufficient — though Path works fine since you're not calling filesystem methods.
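To make the split concrete, here is a minimal sketch showing that a pure path offers the parsing attributes but none of the filesystem methods:

```python
from pathlib import Path, PurePath, PurePosixPath

pure = PurePosixPath("/var/log/services/auth/2026-03-31.log")
print(pure.stem, pure.suffix)  # 2026-03-31 .log

# Pure paths carry no filesystem methods at all.
has_exists = hasattr(pure, "exists")
print(has_exists)  # False

# Path builds on PurePath, so every parsing attribute is inherited.
is_subclass = issubclass(Path, PurePath)
print(is_subclass)  # True
```

Choosing PurePosixPath for remote log paths also keeps the parsing rules fixed, whatever operating system the analysis script runs on.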
The translation table is short: os.path.basename(p) → Path(p).name. os.path.dirname(p) → str(Path(p).parent). os.path.splitext(os.path.basename(p)) → (Path(p).stem, Path(p).suffix); on a full path, os.path.splitext(p) keeps the directory in its first element, so the closest match there is (str(Path(p).with_suffix("")), Path(p).suffix). os.path.join(a, b) → Path(a) / b. os.path.exists(p) → Path(p).exists(). In new code, always use pathlib. In old code, migrate opportunistically: each replacement simplifies the code.
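One way to sanity-check these translations is to assert them against each other. The sketch below assumes it runs on a POSIX system, where os.path uses forward slashes. The filesystem-touching os.path.exists pair is left out so that nothing on disk is consulted.

```python
import os.path
from pathlib import PurePosixPath

path_string = "/var/log/services/auth/2026-03-31.log"
p = PurePosixPath(path_string)

assert os.path.basename(path_string) == p.name
assert os.path.dirname(path_string) == str(p.parent)

# splitext on the bare filename lines up with (stem, suffix) ...
assert os.path.splitext(os.path.basename(path_string)) == (p.stem, p.suffix)
# ... while splitext on the full path keeps the directory in its first element.
assert os.path.splitext(path_string) == (str(p.with_suffix("")), p.suffix)

joined_os = os.path.join("/var/log", "services")
joined_pathlib = str(PurePosixPath("/var/log") / "services")
assert joined_os == joined_pathlib
print("all translations agree")
```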