The linter is quiet on your module from yesterday. No naming violations, no bare excepts, no unused imports. Good. Now — can you read it?
I wrote it two days ago. I remember what most of it does. Why?
I mean can a new person on the project read it. They open the file and see a function called calculate_weighted_score. What do they know about it?
The name tells them it calculates a weighted score. The parameters are items and weights. That should be enough context.
Is weights a list of floats that must sum to 1.0, or integers that get normalized? Does items need to be the same length as weights? What does the function return? What happens if weights is empty? They have to read the implementation to answer any of those. A docstring closes that gap:
def calculate_weighted_score(items: list[float], weights: list[float]) -> float:
    """Return the weighted average of items scaled by weights.

    Args:
        items: Numeric values to weight. Must match the length of weights.
        weights: Non-negative multipliers. Need not sum to 1.0 — normalized internally.

    Returns:
        A float between the minimum and maximum values in items.

    Raises:
        ValueError: If items and weights differ in length, or weights sums to zero.
    """

Every question answered without opening the implementation.
The Args / Returns / Raises structure — is that a specific standard, or just a format you use?
Google style, from the Google Python Style Guide. PEP 257 covers the general docstring conventions that all the section formats build on. There is also NumPy style with dashed underlines, and Sphinx style with :param: directives. They all do the same thing. Pick one per project. The structure matters more than which structure — tools can parse all three.
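To make the three formats concrete, here is the same hypothetical function documented three ways. The function name and body are illustrative, not from any real library; only the docstring layout differs between the three.

```python
def scale_google(values: list[float], factor: float) -> list[float]:
    """Scale each value by factor.

    Args:
        values: Numbers to scale.
        factor: Multiplier applied to each value.

    Returns:
        A new list with each value multiplied by factor.
    """
    return [v * factor for v in values]


def scale_numpy(values, factor):
    """Scale each value by factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float
        Multiplier applied to each value.

    Returns
    -------
    list of float
        A new list with each value multiplied by factor.
    """
    return [v * factor for v in values]


def scale_sphinx(values, factor):
    """Scale each value by factor.

    :param values: Numbers to scale.
    :param factor: Multiplier applied to each value.
    :returns: A new list with each value multiplied by factor.
    """
    return [v * factor for v in values]
```

All three carry the same information; the choice is about what renders well in your toolchain and reads well in your source.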
What does PEP 257 require mechanically? I have written one-line docstrings but I do not know the rules for multi-line ones.
Two layers. Mechanical: triple double-quotes, closing """ on its own line for multi-line docstrings. Semantic: the first line is an imperative sentence describing what the function does, not what it is:
# Wrong — describes what it is
"""A function that parses configuration files."""
# Correct — says what it does
"""Parse a configuration file and return its contents as a dict."""The imperative mood matches how Python built-ins are documented: list.append says "Append object to the end of the list," not "A method that appends."
I have a private function called _process. In a file called processor.py. It processes things. Three layers of the same information.
That function has achieved a rare silence. It says nothing three different ways. The docstring is where it should finally say something useful. And docstring coverage is measurable — you can parse your own source code with the ast module and calculate what fraction of your functions are documented. That is today's challenge.
You can measure documentation coverage the same way you measure test coverage?
ast.parse gives you the syntax tree without executing the code. ast.walk visits every node. ast.get_docstring extracts the cleaned text. Walk the tree, check each function for a docstring, calculate the percentage. Add it to CI — if coverage drops below 80%, the build fails. The tool cannot tell you whether your docstrings are good. It can tell you whether they exist. That is the first gate.
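The walk just described can be sketched in a few lines. The helper name docstring_coverage is my own; the ast calls are standard library.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Return the fraction of functions in source that have a docstring."""
    tree = ast.parse(source)  # syntax tree, no code executed
    functions = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    if not functions:
        return 1.0  # nothing to document counts as fully covered
    documented = sum(
        1 for node in functions if ast.get_docstring(node) is not None
    )
    return documented / len(functions)


code = '''
def documented():
    """Do something."""

def undocumented():
    pass
'''
print(docstring_coverage(code))  # 0.5
```

A CI gate is then one comparison: fail the build when the returned fraction drops below 0.8.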
Documentation as a metric, enforced like test coverage. A new engineer opening the codebase can know before running anything that at least 80% of functions have something to read. The quality is still a human judgment, but the existence is automatic.
Exactly the frame. Tomorrow mypy checks whether the types you write in those signatures are actually honored at every call site. You write the contract today — the docstring and the annotations. Tomorrow it gets verified.
Python docstrings are not comments. A comment (#) is ignored by the interpreter entirely. A docstring is a string literal that Python assigns to the __doc__ attribute of the function, class, or module. That means docstrings are inspectable at runtime: help(my_function) prints the docstring, my_function.__doc__ returns it as a string, and documentation generators like Sphinx and pdoc extract it automatically.
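A small demonstration of the difference, using throwaway function names. The docstring survives as a runtime attribute; the comment leaves no trace on the object.

```python
def greet(name: str) -> str:
    """Return a greeting for name."""
    return f"Hello, {name}!"


def shout(name):
    # this comment is discarded during parsing
    return name.upper()


# The docstring is assigned to __doc__ and is inspectable at runtime.
print(greet.__doc__)  # Return a greeting for name.

# The comment-only function has no __doc__ to inspect.
print(shout.__doc__)  # None
```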
The ast.get_docstring mechanism. ast.get_docstring(node) checks whether the first statement in a function, class, or module body is an ast.Expr node containing a string literal. If it is, the function returns the string value after cleaning it the way inspect.cleandoc does: leading whitespace is stripped from the first line, the indentation common to the continuation lines is removed, and surrounding blank lines are dropped. This works regardless of quoting style — single, double, or triple quotes. The raw ast.Constant value is not returned; the cleaned text is.
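You can see the cleaning by comparing the raw constant to what ast.get_docstring returns. The source string and function name here are made up for the demonstration.

```python
import ast

source = '''
def f():
    """First line.

    Indented continuation line.
    """
'''

tree = ast.parse(source)
func = tree.body[0]

# The raw constant keeps the source file's indentation...
raw = func.body[0].value.value

# ...while ast.get_docstring returns the normalized text.
clean = ast.get_docstring(func)
print(repr(clean))  # 'First line.\n\nIndented continuation line.'
```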
Module docstrings and the PEP 257 ordering rule. PEP 257 specifies that a module docstring, if present, must be the first statement in the file — before imports. This is not enforced by Python but is expected by documentation generators and pydoc. When pydoc generates documentation for a module, it reads the __doc__ attribute, which Python populates from the first string literal in the module body. If imports come first, the __doc__ attribute is None and the generated documentation has no description.
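The ordering rule is easy to verify with ast on two tiny module sources (both invented for the example): the same string literal counts as the module docstring only when it comes before the imports.

```python
import ast

documented = '"""Utilities for scoring."""\nimport math\n'
undocumented = 'import math\n"""Utilities for scoring."""\n'

# Only a string literal that is the FIRST statement becomes the
# module docstring; after an import it is just an ignored expression.
print(ast.get_docstring(ast.parse(documented)))    # Utilities for scoring.
print(ast.get_docstring(ast.parse(undocumented)))  # None
```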
Google style vs. NumPy style vs. Sphinx style. All three formats are supported by Sphinx via the Napoleon extension. Google style uses indented plain text under section headers (Args:, Returns:, Raises:). NumPy style uses dashed underlines under section names. Sphinx style uses RST directives (:param name: description). The practical difference: Google style is most readable in source code without rendering; NumPy style is standard in scientific Python; Sphinx style is the only one Sphinx processes natively without Napoleon. For a web application team, Google style is the clearest choice.
The imperative mood rule. PEP 257 recommends that the first line of a docstring be a phrase in the imperative mood — "Return the result," not "Returns the result" or "This function returns the result." The rationale is consistency with Python's own documentation. The interactive help system displays docstrings in contexts where the imperative reads more naturally: help(list.sort) shows "Sort the list in ascending order and return None," which reads as an instruction, not a description.