Python Files and File Handling
Master tuples, sets, and file I/O — the tools that let your Python programs work with real data.
2 modules · 8 lessons · free to read
What you'll learn
- ✓ Use tuples and sets to organise and deduplicate data
- ✓ Open, read, and write text files using Python's built-in file I/O
- ✓ Process structured text files line by line and build data structures from file content
- ✓ Handle file errors gracefully with try/except
- ✓ Apply tuples, sets, and file handling together to solve real data problems
01 Tuples and Sets
Discover two of Python's most useful but often overlooked data structures — tuples for ordered, immutable sequences and sets for unordered collections with no duplicates.
1. Tuples
A tuple is an ordered, immutable sequence of values — like a list that cannot be changed after it is created.
You create a tuple with parentheses and comma-separated values. Once created, you cannot add, remove, or replace any element. This immutability is not a limitation — it is a guarantee: when you pass a tuple to a function, you can be certain the data won't be modified.
```python
coordinates = (40.7128, -74.0060)
print(coordinates[0])  # 40.7128
print(coordinates[1])  # -74.006
```
Indexing and slicing work exactly as they do for lists: t[0] for the first element, t[-1] for the last, t[1:3] for a slice.
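A short sketch of the indexing, slicing, and immutability rules described above. The `point` tuple is an illustrative value chosen for this example, not part of the lesson's exercises.

```python
# Illustrative values; any tuple behaves the same way.
point = (10, 20, 30, 40)

first = point[0]      # first element
last = point[-1]      # last element
middle = point[1:3]   # slicing a tuple returns a new tuple

# Assigning to an index fails because tuples are immutable.
try:
    point[0] = 99
    changed = True
except TypeError:
    changed = False

print(first, last, middle, changed)  # 10 40 (20, 30) False
```

Catching the `TypeError` here is only for demonstration; in normal code you would simply build a new tuple instead of trying to modify one.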
A single-element tuple requires a trailing comma — otherwise Python treats the parentheses as grouping, not a tuple.
```python
not_a_tuple = (42)   # int, not tuple
is_a_tuple = (42,)   # tuple with one element
```
Packing and unpacking let you assign a tuple's elements to variables in one line. This is especially useful for returning multiple values from a function — Python bundles the values into a tuple automatically.
```python
def min_max(numbers):
    return min(numbers), max(numbers)  # returns a tuple

lo, hi = min_max([3, 1, 7, 2])  # unpacks it
print(lo, hi)  # 1 7
```
Use tuples whenever the data is a fixed collection — coordinates, RGB colours, a (key, value) pair — and a list when the collection will grow or change.
Constraints
- Return a tuple with exactly two elements: the minimum first, then the maximum.
- Use the `min()` and `max()` built-in functions.
- Do not use `return` twice; pack both values into one `return` statement.
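One possible solution sketch that satisfies these constraints, assuming the exercise asks for the `describe_range(numbers)` function worded in the FAQ at the end of this page. It also assumes the input list is non-empty, since `min()` and `max()` raise `ValueError` on an empty sequence.

```python
def describe_range(numbers):
    """Return (min_val, max_val) for a non-empty list of numbers."""
    # The comma packs both values into one tuple in a single return.
    return min(numbers), max(numbers)


lo, hi = describe_range([3, 1, 7, 2])  # unpack the returned tuple
print(lo, hi)  # 1 7
```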
2. Tuple Patterns
Tuples show up constantly in Python's standard library — once you recognise the patterns, you'll use them everywhere.
zip() pairs elements from two lists into tuples, making it easy to process related data together. It stops at the shorter list, so both lists should be the same length.
```python
names = ['Alice', 'Bob', 'Carol']
scores = [88, 92, 75]
for name, score in zip(names, scores):
    print(f'{name}: {score}')
# Alice: 88
# Bob: 92
# Carol: 75
```
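To make the "stops at the shorter list" behaviour concrete, here is a small sketch with deliberately mismatched lists. The shortened `short_scores` list is an assumption made for illustration only.

```python
names = ['Alice', 'Bob', 'Carol']
short_scores = [88, 92]  # one score missing on purpose

pairs = list(zip(names, short_scores))
print(pairs)  # [('Alice', 88), ('Bob', 92)]  ('Carol' is silently dropped)

# A simple guard: compare lengths before zipping.
same_length = len(names) == len(short_scores)
print(same_length)  # False
```

On Python 3.10 and later, `zip(names, short_scores, strict=True)` raises a `ValueError` instead of dropping items, which can be a safer default when the lists are expected to match.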
enumerate() wraps any iterable and yields (index, value) tuples — the cleanest way to loop when you need both position and value.
```python
for i, name in enumerate(['Alice', 'Bob', 'Carol']):
    print(f'{i}: {name}')
# 0: Alice
# 1: Bob
# 2: Carol
```
Because tuples are immutable, Python can hash them, provided every element inside the tuple is itself hashable. That means a tuple of numbers or strings can serve as a dictionary key. Lists are mutable, so they cannot be dictionary keys.
```python
grid = {}
grid[(0, 0)] = 'start'
grid[(3, 4)] = 'exit'
print(grid[(0, 0)])  # start
```
Finally, variable swapping with tuples eliminates the temporary variable. Python evaluates the right-hand side first, packs it into a tuple, then unpacks it on the left.
```python
a, b = 10, 20
a, b = b, a
print(a, b)  # 20 10
```
Use zip() when combining two parallel lists, enumerate() when you need loop indices, tuple keys when your dictionary needs a composite key, and swap syntax whenever you'd otherwise reach for a temp variable.
Constraints
- Use `zip()` to pair the two lists; do not use an index-based loop.
- Return a `dict`, not a list of tuples.
- Assume the two lists always have the same length.
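A possible solution sketch for these constraints, assuming the exercise is the `pair_up(keys, values)` function described in the FAQ at the end of this page.

```python
def pair_up(keys, values):
    """Build a dict from two equal-length lists."""
    # zip() yields (key, value) tuples; dict() consumes them directly.
    return dict(zip(keys, values))


print(pair_up(['a', 'b', 'c'], [1, 2, 3]))  # {'a': 1, 'b': 2, 'c': 3}
```

If the same key appears more than once, the last value wins, because later dictionary entries overwrite earlier ones.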
3. Sets
A set is an unordered collection of unique values — no element can appear more than once, and there is no guaranteed order.
Create a set with curly braces {} when you know the values upfront, or with set() to convert another iterable. Be careful: {} alone creates an empty dict, not an empty set — use set() for an empty set.
```python
colours = {'red', 'blue', 'green', 'red'}  # duplicate 'red' is dropped
print(colours)  # {'blue', 'green', 'red'} (order may vary)

from_list = set([1, 2, 2, 3, 3, 3])
print(from_list)  # {1, 2, 3}

empty = set()  # not {}, which creates a dict
```
The most common reason to reach for a set is fast membership testing. The in operator on a set runs in O(1) time on average, so a lookup costs roughly the same whether the set has 10 elements or 10 million.
```python
allowed = {'admin', 'editor', 'viewer'}
role = 'editor'
print(role in allowed)  # True
```
Sets are mutable: you can add and remove elements after creation.
```python
tags = {'python', 'beginner'}
tags.add('files')         # adds 'files'
tags.discard('missing')   # safe: no error if the element is absent
tags.remove('beginner')   # raises KeyError if absent, so be certain it's there
print(tags)  # {'python', 'files'} (order may vary)
```
Use .discard() when you're not sure the element exists; use .remove() only when you are sure — it raises a KeyError otherwise.
Use a set whenever you need unique values, fast membership testing, or want to remove duplicates from a list.
Constraints
- Use `set()` to remove duplicates; do not use a loop to build the result manually.
- Return a sorted `list`, not a set. Use `sorted()` to produce a predictable order.
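A possible solution sketch for these constraints, assuming the exercise is the `unique_tags(tag_list)` function described in the FAQ at the end of this page.

```python
def unique_tags(tag_list):
    """Return the distinct tags from tag_list as a sorted list."""
    # set() drops duplicates; sorted() returns a list in a predictable order.
    return sorted(set(tag_list))


print(unique_tags(['python', 'files', 'python', 'beginner']))
# ['beginner', 'files', 'python']
```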
4. Set Operations
Set operations let you compare and combine sets using simple operators — the same symbols used in mathematics.
Given two sets a and b:
- `a | b` (union): all elements in either set
- `a & b` (intersection): only elements in both sets
- `a - b` (difference): elements in a but not in b
- `a ^ b` (symmetric difference): elements in one set but not both
```python
python_learners = {'Alice', 'Bob', 'Carol'}
js_learners = {'Bob', 'Carol', 'Dave'}

print(python_learners | js_learners)  # {'Alice', 'Bob', 'Carol', 'Dave'}
print(python_learners & js_learners)  # {'Bob', 'Carol'}
print(python_learners - js_learners)  # {'Alice'}
print(python_learners ^ js_learners)  # {'Alice', 'Dave'}
```
One of the most practical applications of sets is removing duplicates from a list. Converting a list to a set drops all duplicates; converting back to a list gives you a deduplicated list.
```python
visits = ['home', 'about', 'home', 'contact', 'about', 'home']
unique_pages = list(set(visits))
print(unique_pages)  # e.g. ['about', 'contact', 'home']; order not guaranteed
```
Note that sets are unordered, so if you need a sorted result, call sorted() instead of list().
```python
sorted_unique = sorted(set(visits))
print(sorted_unique)  # ['about', 'contact', 'home']
```
Use set operations any time you need to answer questions like "what do these two groups have in common?" or "what's in one group but not the other?" — they express the intent clearly and run efficiently.
Constraints
- Use the set intersection operator `&` to find shared elements.
- Return a sorted `list`, not a set.
- Each element must appear only once in the result even if it appears multiple times in the inputs.
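A possible solution sketch for these constraints, assuming the exercise is the `common_elements(list_a, list_b)` function described in the FAQ at the end of this page.

```python
def common_elements(list_a, list_b):
    """Return a sorted list of the values present in both inputs."""
    # Converting to sets removes duplicates; & keeps only shared values.
    return sorted(set(list_a) & set(list_b))


print(common_elements([3, 1, 2, 3], [3, 2, 5, 2]))  # [2, 3]
```

The sketch assumes the elements can be compared with each other (all numbers or all strings), since `sorted()` needs a consistent ordering.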
02 Reading and Writing Files
Learn to open, read, and write text files in Python — skills that let your programs load real data from disk, process it, and save results.
1. Opening and Reading Files
The open() function returns a file object that lets you read or write a file — it takes at minimum a filename and returns a handle you can call methods on.
The two most common reading methods are .read(), which returns the entire file as a single string, and .readlines(), which returns a list of lines (each line includes its trailing newline character \n).
```python
f = open('notes.txt')   # opens for reading by default
content = f.read()      # entire file as one string
f.close()               # always close when done

f2 = open('notes.txt')
lines = f2.readlines()  # list of strings, one per line
f2.close()
```
Forget to call .close() and your program may hold the file open longer than necessary — or fail to flush written data.
The with statement solves this automatically. It is a context manager: it opens the file, runs the indented block, then closes the file for you — even if an error occurs inside the block.
```python
with open('notes.txt') as f:
    content = f.read()
# file is closed here automatically
print(content)
```
If the file does not exist, open() raises a FileNotFoundError. You will learn to catch this in a later lesson — for now, make sure the filename is spelled correctly.
Use open() with with for all file reading — it is shorter, safer, and the standard Python idiom.
Constraints
- The function receives a list of strings; do not use `open()` or any file I/O inside the function.
- Use the `len()` built-in to count the lines.
- Return an integer.
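A possible solution sketch for these constraints, assuming the exercise is the `count_lines(lines)` function described in the FAQ at the end of this page. The `sample` list stands in for what `.readlines()` would return.

```python
def count_lines(lines):
    """Return how many lines the list contains."""
    return len(lines)


sample = ['first line\n', 'second line\n', 'third line\n']
print(count_lines(sample))  # 3
```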
2. Processing Files Line by Line
Iterating over a file object line by line with for line in file is the most memory-efficient way to read a large file — Python loads one line at a time instead of the whole file at once.
Each line includes its trailing newline character \n. Call .strip() on every line to remove both leading and trailing whitespace (including that newline) before processing.
```python
lines = ['Alice,88\n', 'Bob,92\n', 'Carol,75\n']
for line in lines:
    clean = line.strip()  # 'Alice,88'
    print(clean)
```
Once a line is stripped, you can split it into fields using .split(',') for comma-separated data or .split() (no argument) to split on any whitespace.
```python
for line in lines:
    name, score = line.strip().split(',')
    print(f'{name} scored {score}')
```
Combining these techniques lets you parse structured text files into Python data structures. Here the in-memory list of strings acts as a stand-in for a real file object — the same logic works with for line in file inside a with open(...) block.
```python
records = []
for line in lines:
    parts = line.strip().split(',')
    records.append({'name': parts[0], 'score': int(parts[1])})
print(records)
# [{'name': 'Alice', 'score': 88}, {'name': 'Bob', 'score': 92}, ...]
```
Use line-by-line iteration whenever you are processing a file with one record per line — it is the standard pattern for text-based data.
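The same pattern applied to an actual file object might look like the sketch below. It assumes a writable temporary directory and uses a hypothetical `scores.txt` file created on the spot so the example is self-contained. It keeps only a running total rather than a list, which shows why line-by-line iteration stays light on memory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'scores.txt')  # hypothetical filename

    # Write a small sample file to read back.
    with open(path, 'w') as f:
        f.write('Alice,88\nBob,92\nCarol,75\n')

    total = 0
    count = 0
    with open(path) as f:
        for line in f:  # the file object yields one line per iteration
            name, score = line.strip().split(',')
            total += int(score)
            count += 1

print(count, total)  # 3 255
```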
Constraints
- Strip each line before splitting to remove trailing newlines and whitespace.
- Return a list of dicts, each with exactly the keys `'name'` and `'score'`.
- Convert the score to an integer using `int()`.
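A possible solution sketch for these constraints, assuming the exercise is the `parse_scores(lines)` function described in the FAQ at the end of this page and that every line is well formed (`'Name,Score\n'`).

```python
def parse_scores(lines):
    """Turn 'Name,Score' lines into a list of {'name', 'score'} dicts."""
    records = []
    for line in lines:
        # Strip first so the newline does not end up in the score field.
        name, score = line.strip().split(',')
        records.append({'name': name, 'score': int(score)})
    return records


print(parse_scores(['Alice,88\n', 'Bob,92\n']))
# [{'name': 'Alice', 'score': 88}, {'name': 'Bob', 'score': 92}]
```

A malformed line (a missing comma or a non-numeric score) would raise `ValueError` here; the sketch leaves that unhandled to stay within the stated constraints.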
3. Writing to Files
Opening a file in 'w' mode creates the file if it does not exist, or overwrites it completely if it does. Opening in 'a' mode also creates the file if it is missing, but otherwise appends to the end without erasing the existing content.
Use .write(string) to write a single string. Unlike print(), .write() does not add a newline automatically — you must include \n yourself.
```python
lines = ['Line 1', 'Line 2', 'Line 3']
with open('output.txt', 'w') as f:
    for line in lines:
        f.write(line + '\n')
```
.writelines(iterable) accepts a list of strings and writes each one in turn — still no automatic newlines, so include \n in each string or join them yourself.
```python
with open('output.txt', 'w') as f:
    f.writelines([line + '\n' for line in lines])
```
To add to an existing file without overwriting, use 'a' mode:
```python
with open('log.txt', 'a') as f:
    f.write('New entry\n')
```
You can verify what was written by immediately opening the file again in 'r' mode and reading it back. In a code challenge the executor cannot write to the filesystem — instead, practice the logic on in-memory strings.
Use 'w' when you want a fresh file each run, and 'a' when you want to accumulate entries over multiple runs.
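The difference between the two modes can be checked with a short experiment. This sketch assumes a writable temporary directory and uses a throwaway `log.txt` inside it, so it does not touch any real files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'log.txt')  # throwaway file

    with open(path, 'w') as f:   # creates the file
        f.write('first\n')
    with open(path, 'a') as f:   # keeps 'first' and adds after it
        f.write('second\n')
    with open(path) as f:
        after_append = f.read()

    with open(path, 'w') as f:   # discards everything written so far
        f.write('fresh\n')
    with open(path) as f:
        after_overwrite = f.read()

print(repr(after_append))     # 'first\nsecond\n'
print(repr(after_overwrite))  # 'fresh\n'
```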
Constraints
- Return a single string; do not use print() or file I/O.
- Each item must be followed by exactly one newline character `'\n'`.
- Use str.join() or a loop; either approach is acceptable.
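A possible solution sketch for these constraints, assuming the exercise is the `format_lines(items)` function described in the FAQ at the end of this page.

```python
def format_lines(items):
    """Join items into one string, each followed by a newline."""
    # Appending '\n' to every item (rather than joining with '\n')
    # ensures the last item also ends with a newline.
    return ''.join(item + '\n' for item in items)


print(repr(format_lines(['Line 1', 'Line 2'])))  # 'Line 1\nLine 2\n'
```

Note that `'\n'.join(items)` alone would leave the final item without a trailing newline, which would not meet the second constraint.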
4. Real Data Patterns
Reading CSV-style data into a list of dicts is one of the most common file-processing tasks in Python. It gives each field a name and makes the data easy to query.
The pattern: skip the header line if present, then strip and split each remaining line.
```python
csv_lines = ['name,score', 'Alice,88', 'Bob,92', 'Carol,75']
records = []
for line in csv_lines[1:]:  # skip header
    name, score = line.split(',')
    records.append({'name': name, 'score': int(score)})
print(records)
# [{'name': 'Alice', 'score': 88}, ...]
```
You can also build a dict from the data when you need fast lookup by key:
```python
score_map = {}
for line in csv_lines[1:]:
    name, score = line.split(',')
    score_map[name] = int(score)
print(score_map['Alice'])  # 88
```
When opening real files, you should always guard against FileNotFoundError using a try/except block. This prevents your program from crashing when a file is missing.
```python
try:
    with open('data.csv') as f:
        content = f.read()
except FileNotFoundError:
    print('File not found. Check the filename.')
    content = ''
```
The except clause catches the specific error and lets your program continue gracefully — returning an empty result, logging the problem, or asking the user for a valid path.
Use try/except around every open() call in production code — real files go missing, get renamed, or end up in the wrong directory all the time.
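One way to package this guard for reuse is a small helper that returns an empty result when the file is missing. The helper name `read_lines_or_empty` and the missing path used below are hypothetical, chosen only for this sketch.

```python
import os


def read_lines_or_empty(path):
    """Hypothetical helper: return stripped lines, or [] if the file is missing."""
    try:
        with open(path) as f:
            return [line.strip() for line in f]
    except FileNotFoundError:
        # Fall back to an empty result so the caller can continue.
        return []


# Assumed not to exist: a file inside a directory that is not there.
missing_path = os.path.join('definitely_missing_dir', 'data.csv')
print(read_lines_or_empty(missing_path))  # []
```

Returning an empty list keeps the caller's loop logic unchanged, though in some programs it may be better to log the problem or re-raise so a missing file is not silently ignored.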
Constraints
- Skip the first line, which is the header `'name,score'`.
- Return a `dict` where each key is a name string and each value is an integer score.
- Use `.split(',')` to parse each line; do not use the csv module.
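A possible solution sketch for these constraints, assuming the exercise is the `csv_to_dict(lines)` function described in the FAQ at the end of this page. Calling `.strip()` is an extra precaution in case the lines still carry trailing newlines.

```python
def csv_to_dict(lines):
    """Map each name to its integer score, skipping the header row."""
    result = {}
    for line in lines[1:]:  # lines[0] is the 'name,score' header
        name, score = line.strip().split(',')
        result[name] = int(score)
    return result


print(csv_to_dict(['name,score', 'Alice,88', 'Bob,92']))
# {'Alice': 88, 'Bob': 92}
```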
Frequently Asked Questions
- What makes a tuple different from a list?
- Tuples are immutable: you cannot add, remove, or replace elements after creation, whereas lists can be changed freely. This makes tuples safe to use as return values you don't want accidentally modified and, when all their elements are hashable, as dictionary keys.
- Which of the following correctly creates a tuple with a single element?
- (42,). A trailing comma inside parentheses tells Python this is a tuple, not just a grouped expression. Without the comma, (42) is evaluated as the integer 42. [42] is a list and {42} is a set.
- What does zip(['a', 'b'], [1, 2]) produce when converted to a list?
- [('a', 1), ('b', 2)]. zip() pairs elements from each iterable into tuples. Converting the zip object to a list gives a list of tuples: [('a', 1), ('b', 2)]. This is exactly what makes dict(zip(keys, values)) work — dict() accepts a list of (key, value) tuples.
- What does {'cat', 'dog', 'cat', 'bird'} evaluate to?
- {'cat', 'dog', 'bird'} — duplicates are removed. Sets automatically discard duplicate values. Even though 'cat' appears twice in the literal, the resulting set contains only one copy of each unique value: {'cat', 'dog', 'bird'}. This is the defining property of a set.
- What does set([1, 2, 3]) & set([2, 3, 4]) return?
- {2, 3}. The & operator computes the intersection — elements that appear in both sets. Both {1, 2, 3} and {2, 3, 4} contain 2 and 3, so the result is {2, 3}. The union would be {1, 2, 3, 4} and the difference would be {1} or {4} depending on the order.
- How do I write a Python function `describe_range(numbers)` that takes a list of numbers and returns the minimum and maximum values as a two-element tuple `(min_val, max_val)`?
- Compute both values with the built-ins and pack them into a single return statement: `return min(numbers), max(numbers)`. The comma builds the tuple, so the caller can unpack it with `lo, hi = describe_range(numbers)`.
- How do I write a Python function `pair_up(keys, values)` that takes two lists of equal length and returns a dictionary built by zipping them together?
- Pass the zipped pairs straight to `dict()`: `return dict(zip(keys, values))`. `zip()` yields one `(key, value)` tuple per position, and `dict()` turns those pairs into entries.
- How do I write a Python function `unique_tags(tag_list)` that takes a list of strings (which may contain duplicates) and returns a sorted list of the unique tags?
- Convert the list to a set to drop duplicates, then sort the result: `return sorted(set(tag_list))`. `sorted()` always returns a list, so no extra conversion is needed.
- How do I write a Python function `common_elements(list_a, list_b)` that takes two lists and returns a sorted list of elements that appear in both lists, with no duplicates?
- Convert both lists to sets, intersect them with `&`, and sort the result: `return sorted(set(list_a) & set(list_b))`. The sets remove duplicates and the intersection keeps only the shared elements.
- What is the main advantage of using `with open('file.txt') as f:` over `f = open('file.txt')`?
- It automatically closes the file when the block exits, even if an error occurs. The with statement is a context manager — it guarantees that the file is closed when the indented block exits, whether normally or due to an exception. Without it, you must remember to call f.close() manually, and a crash before that point would leave the file open.
- What does `.readlines()` return when called on an open file?
- A list of strings, one per line, each including the trailing newline. .readlines() returns a list where each element is one line from the file. Crucially, each string includes its trailing '\n' character — so you usually call .strip() on each line before processing it. .read() returns the whole file as one string.
- Which file mode would you use to add new content to the end of an existing file without erasing what's already there?
- 'a'. 'a' (append) mode opens the file and positions the write cursor at the end, so new content is added after the existing content. 'w' (write) mode overwrites the entire file from scratch. 'r' is read-only and raises an error if you try to write.
- Why must you include `'\n'` when calling `f.write('Hello')`?
- Unlike print(), f.write() does not add a newline automatically. f.write() writes exactly the string you give it — nothing more. If you want a newline at the end of a line, you must include '\n' in the string: f.write('Hello\n'). This is different from print() which appends a newline by default.
- What exception should you catch when a file might not exist?
- FileNotFoundError. open() raises it when the specified file does not exist, so it is the specific exception to catch when you are not sure a file will be present. IOError is an alias of OSError, the broader parent class that also covers problems such as permission errors; FileNotFoundError is the precise, idiomatic choice for this case.
- How do I write a Python function `count_lines(lines)` that takes a list of strings (representing the lines of a file, as returned by `.readlines()`) and returns the number of lines in the list?
- Each list element is one line, so the answer is the length of the list: `return len(lines)`. No file I/O is needed inside the function.
- How do I write a Python function `parse_scores(lines)` that takes a list of strings in the format `'Name,Score\n'` and returns a list of dicts, each with keys `'name'` (string) and `'score'` (integer)?
- Loop over the lines, call `line.strip().split(',')` to get the name and score, and append `{'name': name, 'score': int(score)}` to a result list. Strip before splitting so the trailing newline does not end up in the score.
- How do I write a Python function `format_lines(items)` that takes a list of strings and returns a single string where each item is followed by a newline character `'\n'`, suitable for writing to a file with `.write()`?
- Append `'\n'` to every item and join the pieces: `return ''.join(item + '\n' for item in items)`. Joining with `'\n'.join(items)` alone would leave the last item without a trailing newline.
- How do I write a Python function `csv_to_dict(lines)` that takes a list of strings representing CSV lines (first line is a header `'name,score'`, rest are data rows) and returns a dict mapping each name to its integer score?
- Skip the header with `lines[1:]`, split each remaining line with `.split(',')`, and store `result[name] = int(score)` in a dict that you return at the end.