I ate so mnay salads.
This document gives coding conventions for the Python code comprising the standard library in the main Python distribution.
Many projects have their own coding style guidelines. In the event of any conflicts, such project-specific guides take precedence for that project.
I’m a bit surprised that they don’t strongly recommend PEP8 for everyday programming as it just says they are used in the standard library, while I feel most Python developers mindlessly follow them.
However, I think it’s worth following them to keep code clean and readable because there should be reasons Guido and others created this style guide.
# Correct:
# Add 4 spaces (an extra level of indentation) to distinguish arguments from the rest.
def long_function_name(
var_one, var_two, var_three,
var_four):
print(var_one)
# Wrong:
# Further indentation required as indentation is not distinguishable.
def long_function_name(
var_one, var_two, var_three,
var_four):
print(var_one)
I didn’t know about this formatting. I usually use
black
, and with it, the definition becomes usually
def long_function_name(
var_one, var_two, var_three, var_four
):
print(var_one)
or
def long_function_name(
var_one,
var_two,
var_three,
var_four,
):
I’m so used to it that I feel more comfortable with the
black
’s style. I only see the black
’s pattern,
so I’m not very sure which is more common nowadays.
Limit all lines to a maximum of 79 characters.
The original version of PEP8 was created in 2001, and I think this suggestion is obsolete as people nowadays use wide windows that are not limited to 80 characters width that has a historical reason.
black
set 88, and I see many projects use 120 as the
maximum line length.
Some teams strongly prefer a longer line length. For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the line length limit up to 99 characters, provided that comments and docstrings are still wrapped at 72 characters.
:thinking_face: From my (limited) experience, modern Python code don’t strictly follow the guidance.
The preferred way of wrapping long lines is by using Python’s implied line continuation inside parentheses, brackets and braces.
I sometimes see the overuse of backslashes that don’t look great. I might want to cite this in the future.
# Wrong:
# operators sit far away from their operands
income = (gross_wages +
taxable_interest +
(dividends - qualified_dividends) -
ira_deduction -
student_loan_interest)
# Correct:
# easy to match operators with operands
income = (gross_wages
+ taxable_interest
+ (dividends - qualified_dividends)
- ira_deduction
- student_loan_interest)
In Python code, it is permissible to break before or after a binary operator, as long as the convention is consistent locally. For new code Knuth’s style (*mentioned as the correct one aove) is suggested.
# Correct:
import os
import sys
# Wrong:
import sys, os
I didn’t know that I can do import lib1, lib2, lib3, ...
as this is the first time I see this pattern. I may be too young to know
the historical development of the Python community.
- Imports should be grouped in the following order: > 1. Standard library imports. > 2. Related third party imports. > 3. Local application/library specific imports. > You should put a blank line between each group of imports.
- Absolute imports are recommended…However, explicit relative imports are an acceptable alternative to absolute imports…Standard library code should avoid complex package layouts and always use absolute imports.
- Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
This section reminds me of the use of __all__
. It
slipped out of my mind.
- Between a trailing comma and a following close parentheses:
# Correct
foo = (0,)
# Wrong
bar = (0, )
flake8
and ruff
does not raise errors for
it, but black
fixes the wrong style. :thinking_face:
- If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies). Use your own judgment;
# Correct:
hypot2 = x*x + y*y
# Wrong:
hypot2 = x * x + y * y
I was not aware of this pattern. I think I should remember.
- Don’t use spaces around the = sign when used to indicate a keyword argument, or when used to indicate a default value for an unannotated function parameter:…When combining an argument annotation with a default value, however, do use spaces around the = sign:
# Correct:
def munge(sep: AnyStr = None): ...
# Wrong:
def munge(input: AnyStr=None): ...
Python coders from non-English speaking countries: please write your comments in English, unless you are 120% sure that the code will never be read by people who don’t speak your language.
In addition, the following special forms using leading or trailing underscores are recognized (these can generally be combined with any case convention):
- _single_leading_underscore: weak “internal use” indicator. E.g. from M import * does not import objects whose names start with an underscore.
- single_trailing_underscore_: used by convention to avoid conflicts with Python keyword, e.g. :
tkinter.Toplevel(master, class_='ClassName')
My mentor pointed out this on my pull request.
Never use the characters ‘l’ (lowercase letter el), ‘O’ (uppercase letter oh), or ‘I’ (uppercase letter eye) as single character variable names.
In some fonts, these characters are indistinguishable from the numerals one and zero. When tempted to use ‘l’, use ‘L’ instead.
Modules that are designed for use via
from M import *
should use the__all__
mechanism to prevent exporting globals…
If a function argument’s name clashes with a reserved keyword, it is generally better to append a single trailing underscore rather than use an abbreviation or spelling corruption. Thus class_ is better than clss. (Perhaps better is to avoid such clashes by using a synonym.)
To avoid name clashes with subclasses, use two leading underscores to invoke Python’s name mangling rules. …Generally, double leading underscores should be used only to avoid name conflicts with attributes in classes designed to be subclassed.
Note: there is some controversy about the use of __names (see below).
Hmm… interesting
…Note 1: Try to keep the functional behavior side-effect free, although side-effects such as caching are generally fine…Note 2: Avoid using properties for computationally expensive operations; the attribute notation makes the caller believe that access is (relatively) cheap.
If your class is intended to be subclassed, and you have attributes that you do not want subclasses to use, consider naming them with double leading underscores and no trailing underscores. This invokes Python’s name mangling algorithm, where the name of the class is mangled into the attribute name. This helps avoid attribute name collisions should subclasses inadvertently contain attributes with the same name.
I remember I had some discussion on these points.
Any backwards compatibility guarantees apply only to public interfaces. Accordingly, it is important that users be able to clearly distinguish between public and internal interfaces.
To better support introspection, modules should explicitly declare the names in their public API using the
__all__
attribute. Setting__all__
to an empty list indicates that the module has no public API.
- Code should be written in a way that does not disadvantage other implementations of Python (PyPy, Jython, IronPython, Cython, Psyco, and such).
I didn’t know many of the implementations.
…For example, do not rely on CPython’s efficient implementation of in-place string concatenation for statements in the form
a += b
ora = a + b
. This optimization is fragile even in CPython (it only works for some types) and isn’t present at all in implementations that don’t use refcounting. In performance sensitive parts of the library, the''.join()
form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.
I see…
…Also, beware of writing
if x
when you really meanif x is not None
– e.g. when testing whether a variable or argument that defaults to None was set to some other value. The other value might have a type (such as a container) that could be false in a boolean context!
I encountered this pitfall for some times in my life.
- Derive exceptions from Exception rather than BaseException. Direct inheritance from BaseException is reserved for exceptions where catching them is almost always the wrong thing to do.
- When catching exceptions, mention specific exceptions whenever possible instead of using a bare
except:
clause: …A bareexcept
: clause will catch SystemExit and KeyboardInterrupt exceptions, making it harder to interrupt a program with Control-C, and can disguise other problems. If you want to catch all exceptions that signal program errors, useexcept Exception
: (bare except is equivalent toexcept BaseException:
).
I didn’t know that a bare except:
will catch SystemExit
and KeyboardInterrupt.
- Additionally, for all try/except clauses, limit the try clause to the absolute minimum amount of code necessary. Again, this avoids masking bugs:
I always adovocate this, but it seems beginner Python developers tend to have a very broad try-except clause.
- Use
''.startswith()
and''.endswith()
instead of string slicing to check for prefixes or suffixes. startswith() and endswith() are cleaner and less error prone:
- Object type comparisons should always use isinstance() instead of comparing types directly:
- For sequences, (strings, lists, tuples), use the fact that empty sequences are false:
I remember I got a review comment about these guides.
Tada! :tada: I finished reading through PEP8. It was worth it to re-visit style guides I was about to forget.
TODO: