20250329

I ate so mnay salads.

Browsing PEP8 — Style Guide for Python Code

This document gives coding conventions for the Python code comprising the standard library in the main Python distribution.

Many projects have their own coding style guidelines. In the event of any conflicts, such project-specific guides take precedence for that project.

I’m a bit surprised that they don’t strongly recommend PEP8 for everyday programming as it just says they are used in the standard library, while I feel most Python developers mindlessly follow them.

However, I think it’s worth following them to keep code clean and readable because there should be reasons Guido and others created this style guide.

Indentation

# Correct:

# Add 4 spaces (an extra level of indentation) to distinguish arguments from the rest.
def long_function_name(
        var_one, var_two, var_three,
        var_four):
    print(var_one)

# Wrong:

# Further indentation required as indentation is not distinguishable.
def long_function_name(
    var_one, var_two, var_three,
    var_four):
    print(var_one)

I didn’t know about this formatting. I usually use black, and with it, the definition becomes usually

def long_function_name(
   var_one, var_two, var_three, var_four
):
   print(var_one)

or

def long_function_name(
    var_one,
    var_two,
    var_three,
    var_four,
):

I’m so used to it that I feel more comfortable with the black’s style. I only see the black’s pattern, so I’m not very sure which is more common nowadays.

Maximum Line Length

Limit all lines to a maximum of 79 characters.

The original version of PEP8 was created in 2001, and I think this suggestion is obsolete as people nowadays use wide windows that are not limited to 80 characters width that has a historical reason.

black set 88, and I see many projects use 120 as the maximum line length.

Some teams strongly prefer a longer line length. For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the line length limit up to 99 characters, provided that comments and docstrings are still wrapped at 72 characters.

:thinking_face: From my (limited) experience, modern Python code don’t strictly follow the guidance.

The preferred way of wrapping long lines is by using Python’s implied line continuation inside parentheses, brackets and braces.

I sometimes see the overuse of backslashes that don’t look great. I might want to cite this in the future.

Should a Line Break Before or After a Binary Operator?

# Wrong:
# operators sit far away from their operands
income = (gross_wages +
          taxable_interest +
          (dividends - qualified_dividends) -
          ira_deduction -
          student_loan_interest)

# Correct:
# easy to match operators with operands
income = (gross_wages
          + taxable_interest
          + (dividends - qualified_dividends)
          - ira_deduction
          - student_loan_interest)

In Python code, it is permissible to break before or after a binary operator, as long as the convention is consistent locally. For new code Knuth’s style (*mentioned as the correct one aove) is suggested.

Imports

# Correct:
import os
import sys

# Wrong:
import sys, os

I didn’t know that I can do import lib1, lib2, lib3, ... as this is the first time I see this pattern. I may be too young to know the historical development of the Python community.

Module Level Dunder Names

This section reminds me of the use of __all__. It slipped out of my mind.

Whitespace in Expressions and Statements

Pet Peeves

# Correct
foo = (0,)

# Wrong
bar = (0, )

flake8 and ruff does not raise errors for it, but black fixes the wrong style. :thinking_face:

Other Recommendations

# Correct:
hypot2 = x*x + y*y

# Wrong:
hypot2 = x * x + y * y

I was not aware of this pattern. I think I should remember.

# Correct:
def munge(sep: AnyStr = None): ...

# Wrong:
def munge(input: AnyStr=None): ...

Comments

Python coders from non-English speaking countries: please write your comments in English, unless you are 120% sure that the code will never be read by people who don’t speak your language.

Naming Conventions

Descriptive: Naming Styles

In addition, the following special forms using leading or trailing underscores are recognized (these can generally be combined with any case convention):

My mentor pointed out this on my pull request.

Priscriptive: Naming Conventions

Names to Avoid

Never use the characters ‘l’ (lowercase letter el), ‘O’ (uppercase letter oh), or ‘I’ (uppercase letter eye) as single character variable names.

In some fonts, these characters are indistinguishable from the numerals one and zero. When tempted to use ‘l’, use ‘L’ instead.

Global Variable Names

Modules that are designed for use via from M import * should use the __all__ mechanism to prevent exporting globals…

Function and Method Arguments

If a function argument’s name clashes with a reserved keyword, it is generally better to append a single trailing underscore rather than use an abbreviation or spelling corruption. Thus class_ is better than clss. (Perhaps better is to avoid such clashes by using a synonym.)

Method Names and Instance Variables

To avoid name clashes with subclasses, use two leading underscores to invoke Python’s name mangling rules. …Generally, double leading underscores should be used only to avoid name conflicts with attributes in classes designed to be subclassed.

Note: there is some controversy about the use of __names (see below).

Hmm… interesting

Designing for Inheritance

…Note 1: Try to keep the functional behavior side-effect free, although side-effects such as caching are generally fine…Note 2: Avoid using properties for computationally expensive operations; the attribute notation makes the caller believe that access is (relatively) cheap.

If your class is intended to be subclassed, and you have attributes that you do not want subclasses to use, consider naming them with double leading underscores and no trailing underscores. This invokes Python’s name mangling algorithm, where the name of the class is mangled into the attribute name. This helps avoid attribute name collisions should subclasses inadvertently contain attributes with the same name.

I remember I had some discussion on these points.

Public and Internal Interfaces

Any backwards compatibility guarantees apply only to public interfaces. Accordingly, it is important that users be able to clearly distinguish between public and internal interfaces.

To better support introspection, modules should explicitly declare the names in their public API using the __all__ attribute. Setting __all__ to an empty list indicates that the module has no public API.

Programming Recommendations

I didn’t know many of the implementations.

…For example, do not rely on CPython’s efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn’t present at all in implementations that don’t use refcounting. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.

I see…

…Also, beware of writing if x when you really mean if x is not None – e.g. when testing whether a variable or argument that defaults to None was set to some other value. The other value might have a type (such as a container) that could be false in a boolean context!

I encountered this pitfall for some times in my life.

I didn’t know that a bare except: will catch SystemExit and KeyboardInterrupt.

I always adovocate this, but it seems beginner Python developers tend to have a very broad try-except clause.

I remember I got a review comment about these guides.

Tada! :tada: I finished reading through PEP8. It was worth it to re-visit style guides I was about to forget.


TODO:


index 20250328 20250330