Without static typing and compile-time checks, writing bug-free Python code can be tricky. A common approach is to rely on unit tests to exercise all the code and flush out bugs, but Python does have static analysis and style-checking tools available, like pylint and pep8.

Furthermore, it is nice to have pep8 style requirements on code that is checked into a shared repository, to keep things consistent across different authors. So before committing code I often found myself fussing with pep8 problems as much as I was testing and fixing functional issues. My process was generally:

  1. Run pylint -E and fix any errors
  2. Run unit tests, fix more errors
  3. (optional) Run pep8 and fix style errors
  4. (optional) Run pylint without -E, fix warnings, feel guilty about long and short variable names and functions with too many arguments
  5. Repeat until there are no more errors

There are some problems with this:

  1. pylint is noisy: there are lots of style complaints that you might not want to fix
  2. If you have a lot of test cases, it can take a long time to get to the ones you care about
  3. pylint -E is also noisy when using frameworks like Django (the dreaded “Class X has no ‘objects’ member” error) because Django does a lot of metaprogramming that pylint cannot follow
  4. Why can’t stupid pep8 whitespace errors fix themselves!

The good news is that problem 4 is already solved by autopep8: we can just run autopep8 -i on the files we want to fix, and that will correct all sorts of pep8 spacing problems.

Problem 3 is also solvable via the pylint-django package, which sets up all the pylint special cases for the model meta magic.

Problem 1 is solved by disabling the pylint message classes you do not want to see using the -d flag, but who wants to do all that customization and looking up of message IDs? A better way for lazy people is to install the prospector package. Prospector doesn’t just tone down pylint’s output; it also includes some other nice tools like mccabe and pyflakes, aggregates results between them, and eliminates duplicates.

The next problem is 2. One could just always run all the unit tests when developing, but if you have hundreds or thousands of them, it might be a few minutes before the suite gets around to the tests you actually care about. One solution is to selectively pick the package or module you are developing on the command line, but this involves figuring out which files you are working on, picking the related tests, and putting it all on the command line. Another solution is to use an IDE that auto-runs your tests, which presumably will run the tests for the files you are changing first. This problem should be automatable, though, right? By convention (and by default in most test runners), Python test files start with test_, so the tests for foo.py generally live somewhere in a file called test_foo.py. So an efficient test runner solution would look like this:

  1. Get a list of files that you have modified from git. This is easily scriptable with git diff --name-only; for example, git diff --name-only --cached lists the staged files.
  2. From that list, pull out the changed files that are themselves test files and set them aside for a minute.
  3. Take the non-test files and find their test file counterparts starting with test_.
  4. Take the union of the synthesized test file set and the other changed test files.
  5. Pass this list of test files to the test runner.

For that matter, you can use the set of test files plus the other modified files to scope the results from prospector and autopep8 as well. This is generally a good idea when dealing with a large legacy code base, because you may not want to make a bunch of random pep8 whitespace changes to files you are not actively working on.

Great, we are ready to put it all together now. First get the list of files:

pyfiles=$(git diff --cached --name-only | grep '\.py$')

This grabs all the staged Python files (note the escaped dot in the grep pattern, so that it only matches a literal .py suffix); we could instead diff against HEAD^ to get the last commit if we like. Next, let’s filter out the files that no longer exist (e.g. if we just did git rm):

existingfiles=""
for f in $pyfiles; do
    if [ -e "$f" ]; then
        existingfiles="$existingfiles $f"
    fi
done
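As an aside, reasonably recent git can do this filtering for us: in the --diff-filter option a lowercase letter excludes that change type, so --diff-filter=d drops deletions from the listing. A sketch using a throwaway repo (all filenames here are made up for the demo):

```shell
# Demo in a throwaway repo: --diff-filter=d (lowercase d excludes deletions)
# lets git drop deleted files itself, making the existence loop unnecessary.
repo=$(mktemp -d) && cd "$repo" && git init -q
echo pass > a.py
echo pass > b.py
git add . && git -c user.name=t -c user.email=t@t commit -qm init
git rm -q b.py                   # stage a deletion
echo '# edit' >> a.py && git add a.py
pyfiles=$(git diff --cached --name-only --diff-filter=d | grep '\.py$')
echo "$pyfiles"                  # a.py -- the deleted b.py is excluded
```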

Next, let’s generate a list of test files:

declare -A totest
for f in $existingfiles; do
    if [[ "$f" =~ "/test_" ]]; then
        totest["$f"]=1
    else
        withtestpath=${f/\//\/tests\/}
        withtestprefix=$(echo "$withtestpath" | sed 's/\([a-zA-Z_]*\.py\)/test_\1/')
        if [ -e "$withtestprefix" ]; then
            totest["$withtestprefix"]=1
        fi
    fi
done

In the above we use an associative array to store filenames, which eliminates duplicates (in case we are editing both test_foo.py and foo.py). The if branch collects files that are already test files. The else branch constructs test filenames from the non-test files: $withtestpath injects “tests” after the first path element, reflecting the particular directory structure of this project, and $withtestprefix prepends “test_” to the name of the Python file to get the full test path.
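To make the transformation concrete, here is what those two substitutions do to a hypothetical myapp/models.py (the filename is just an example):

```shell
f="myapp/models.py"                  # hypothetical input file
withtestpath=${f/\//\/tests\/}       # inject tests/ after the first path element
echo "$withtestpath"                 # myapp/tests/models.py
withtestprefix=$(echo "$withtestpath" | sed 's/\([a-zA-Z_]*\.py\)/test_\1/')
echo "$withtestprefix"               # myapp/tests/test_models.py
```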

Now we are in a position to start using our file lists:

autopep8 -i $existingfiles
prospector | grep -E -A2 "$(echo "${existingfiles// /|}" | sed 's/^|//' | sed 's/|$//')"

We just run autopep8 (in-place) on all the existing files we changed, and then run prospector. Since prospector runs over all the files in the project, the extra grep filters the output down to just the contexts that relate to the files we modified (note the double quotes around the command substitution, which keep the pattern intact as a single grep argument).
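The grep pattern construction deserves a closer look: the substitution replaces every space in $existingfiles with |, and because the variable was built with a leading space in the earlier loop, the first sed strips the resulting leading |. A small standalone demo with made-up filenames:

```shell
existingfiles=" a.py b.py"       # note the leading space from the build loop
pattern=$(echo "${existingfiles// /|}" | sed 's/^|//' | sed 's/|$//')
echo "$pattern"                  # a.py|b.py -- a valid alternation for grep -E
```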

Finally, we convert our test files to Python package paths and run them as tests:

testfiles="${!totest[*]}"
nopy=${testfiles//.py/}
packages=${nopy//\//.}
./manage.py test $packages

Here we take the keys of the associative array, strip the .py suffixes, convert the /s to .s, and feed the result into the test tool (Django’s manage.py in this case).
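The same substitutions can be checked in isolation; with two hypothetical test files, the conversion to package paths looks like this:

```shell
testfiles="myapp/tests/test_models.py myapp/tests/test_views.py"  # example input
nopy=${testfiles//.py/}          # strip the .py suffixes
packages=${nopy//\//.}           # turn path separators into dots
echo "$packages"                 # myapp.tests.test_models myapp.tests.test_views
```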