Raise the Bar of Code Quality in Python Projects
You can also read it on Medium.
Assume that you have landed one of your dream jobs. You expect to work with the most talented colleagues, and you feel very excited. After the orientation days, you finally sit at your desk and check the project you will be working on.
After spending some time, you realize that everything looks more complicated than it should because there are no style checks in the project. Many developers contribute, and each of them has a different coding style, as usual. It would be hard to comprehend the project, right? It would be much simpler if all parts of the project followed the same style, if imports were in order, and if there were neither unused imports nor unused variables.
Yet, the company doesn’t want to force the developers to think about styling in each commit. What could they do? They could use a pre-commit hook and a CI pipeline! In this article, we will define a pre-commit hook and add a GitHub Action as CI to a basic Python project.
I will not explain how pre-commit hooks or GitHub Actions work in detail. Yet, it is always good to recall simple definitions.
Git hooks are scripts that run automatically every time a particular event occurs in a Git repository. They let you customize Git’s internal behavior and trigger customizable actions at key points in the development life cycle.
Git hooks are necessary because:
Git hook scripts are useful for identifying simple issues before submission to code review. We run our hooks on every commit to automatically point out issues in code such as missing semicolons, trailing whitespace, and debug statements. By pointing these issues out before code review, this allows a code reviewer to focus on the architecture of a change while not wasting time with trivial style nitpicks.
pre-commit is a multi-language package manager for pre-commit hooks. You specify a list of hooks you want and pre-commit manages the installation and execution of any hook written in any language before every commit. pre-commit is specifically designed to not require root access. If one of your developers doesn’t have node installed but modifies a JavaScript file, pre-commit automatically handles downloading and building node to run eslint without root.
For a deeper understanding of Git hooks, you might check these links: 1, 2, 3
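To make the definitions above more concrete, here is a minimal sketch of the kind of check a hook script might perform. It is not taken from pre-commit or from the article's repository: the function names and the list of patterns are assumptions made for illustration. The only Git behaviour it relies on is that a pre-commit hook rejects a commit by exiting with a non-zero status.

```python
import re

# Hypothetical patterns for leftover debugging calls; extend as needed.
DEBUG_PATTERN = re.compile(r"^\s*(breakpoint\(\)|import pdb|pdb\.set_trace\(\))")


def find_debug_statements(source):
    """Return the 1-based line numbers that look like debug statements."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if DEBUG_PATTERN.match(line)
    ]


def check_files(filenames):
    """Check each file and return the exit status a hook would report."""
    status = 0
    for filename in filenames:
        with open(filename, encoding="utf-8") as handle:
            for number in find_debug_statements(handle.read()):
                print(f"{filename}:{number}: debug statement found")
                status = 1  # Any finding makes the whole check fail.
    return status
```

A real hook script would pass the result of check_files to sys.exit so that Git sees the failure and aborts the commit.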
I have created a new repository for this article. You might check it here. Before adding any configuration, please take a look at the module and the src directory with two packages. They all have style errors according to PEP 8 (E302, E303, and more):
src/bar/first_module.py
src/foo/first_module.py
main.py
We will start with a file named .pre-commit-config.yaml in the root of the project to configure the pre-commit hook:
$ touch .pre-commit-config.yaml
$ nano .pre-commit-config.yaml
[Embedded file: .pre-commit-config.yaml]
When we glance at this config file:
- The .git and .tox files are excluded using the exclude keyword. The pre-commit hook will not try to fix these files. It is an optional field.
- The pre-commit hook will run each time you try to commit. We used the default_stages: [commit] keyword and value for that. It is also an optional field. Other possible values are: commit, merge-commit, push, prepare-commit-msg, commit-msg, post-checkout, post-commit, or manual.
- If one hook fails, the rest of the steps will not run. We set fail_fast to true for this behaviour. It is an optional field as well, and its default is false.
These were the top-level configurations. Other fields you can use are default_language_version, files, minimum_pre_commit_version, and repos. Of these, we will use only repos in our config file. We added a few repositories, which are going to run step by step:
- The first one is pre-commit-hooks. We are using it for: trailing-whitespace (trims whitespace from the end of lines), end-of-file-fixer (makes sure every file ends with a newline), check-toml (checks the syntax of TOML files), and check-merge-conflict (checks for files that still contain merge conflict markers).
- The second one is black, i.e. The uncompromising Python code formatter. Black fixes almost all of the styling errors automatically.
- The third one is isort, a library that sorts your imports by type and name automatically. It separates Python’s built-in module/package imports, third-party imports, and project module/package imports.
- The final one is flake8. It checks your code against the PEP 8 standard, using pyflakes and other libraries. Even though black and isort solve most of the problems, you may still have some errors such as long strings.
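As a rough illustration of what the two fixer hooks in the first repository do, here is a minimal sketch. It is not the actual implementation from pre-commit-hooks, which handles more cases; the function name fix_whitespace is an assumption made for this example.

```python
def fix_whitespace(text):
    """Strip trailing whitespace and end the text with exactly one newline."""
    # trailing-whitespace: remove spaces and tabs at the end of each line.
    lines = [line.rstrip() for line in text.splitlines()]
    # end-of-file-fixer: drop blank lines at the end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    if not lines:
        return ""
    # Finish with a single newline character.
    return "\n".join(lines) + "\n"
```

Because fixers like these rewrite the file, the commit is stopped and you have to stage the corrected file again before retrying.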
As you can see, we included a few libraries which might cause conflicts unless you configure them. For example, when you use default configurations, black and isort handle the long line imports differently.
Before installing and running the pre-commit hook, we will create configuration files for these libraries, and they will be compatible. Again, we will create all the new files at the root of the project.
Run the following to create a config file named pyproject.toml for black and isort:
$ touch pyproject.toml
$ nano pyproject.toml
[Embedded file: pyproject.toml]
I find the default maximum line lengths, 88 for black and 79 for flake8, too small. Therefore, I usually set the line-length parameter to 100. When lines are too long, it becomes difficult to see all of the code at once because you have to scroll to the right. However, with this value, GitHub, GitLab, and Bitbucket will still show the full line without scrolling.
For isort, we set the black profile and the multi_line_output parameter (it means Vertical Hanging Indent). You can find more information about isort’s black compatibility in this link.
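For readers who have not seen it, the snippet below sketches what the Vertical Hanging Indent layout looks like. The import itself is an arbitrary example chosen because, on a single line, it would exceed a 100-character limit; it does not come from the article's repository.

```python
# One name per line, wrapped in parentheses, with a trailing comma.
# black formats long imports the same way, so the two tools agree.
from collections import (
    ChainMap,
    Counter,
    OrderedDict,
    UserDict,
    UserList,
    UserString,
    defaultdict,
    deque,
    namedtuple,
)
```

A practical benefit of this layout is that adding or removing one imported name changes exactly one line in the diff.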
Note: Alternatively, you can create a new file named .isort.cfg for the isort configuration. In that file, you should use [settings] instead of [tool.isort]. Please check the code below:
$ touch .isort.cfg
$ nano .isort.cfg
[Embedded file: .isort.cfg]
Let’s create a config file named .flake8 for flake8:
$ touch .flake8
$ nano .flake8
[Embedded file: .flake8]
As you see, I gave the same value for max-line-length, and I excluded some paths that flake8 would otherwise check unnecessarily. If you use a virtual environment, you will probably have a venv folder in your project, where you install the libraries you depend on. I can assure you that you wouldn’t want flake8 to check your third-party libraries because it may take a long time and flake8 may find tons of warnings and errors. Please, never change the code of the third-party libraries you use. If you spot an error, open an issue and a PR in the library’s repository. If you need more changes, fork the repo, modify it, and use your fork. The reason is that not all developers will have the same environment as you, so a local change that works for you won’t work for others or in production.
Finally, we will add all these config files to Git.
$ git add .pre-commit-config.yaml pyproject.toml .flake8
Okay, we are ready to go. I will install pre-commit via pip:
(venv) $ pip install pre-commit
(venv) $ pre-commit install
Since we installed the pre-commit hook, it will run on every commit. Let’s try to commit the added config files and see whether our hook runs or not.
$ git commit -m "Add configurations"
It should give the following error:
Our configuration files didn’t have a newline at the end
Okay, try to commit again:
$ git add .flake8 pyproject.toml
$ git commit -m "Add configurations"
The expected output is as follows:
We committed this time without a problem
It worked this time. The pre-commit hook skipped the black, isort, and flake8 steps because we didn’t add any Python modules to our commit. If we had added one, it would have run these steps, too.
I had committed files with errors before I installed the pre-commit hook. What I want is to run the pre-commit hook and fix these errors. We can use pre-commit’s run command:
$ pre-commit run -a
It will run the pre-commit hook for all files (that’s what -a stands for). If you use the same files, you will need to run this command three times to:
- Add new empty lines to the modules,
- Reformat code using black
- Reformat imports using isort.
After that, you should see an error because of flake8:
Remove the unused import line from main.py:
import random # An import which we will not use at all
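To show the idea behind this kind of warning, here is a simplified sketch of unused-import detection built on Python's ast module. It is not how pyflakes is implemented, and it ignores cases such as star imports, names used only in __all__, and scoping; the function name find_unused_imports is an assumption made for this example.

```python
import ast


def find_unused_imports(source):
    """Return the imported names that are never referenced in the source."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds the name "a"; "import x as y" binds "y".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Name):
            # Attribute access such as os.getcwd() still starts with a Name node.
            used.add(node.id)
    return sorted(imported - used)
```

For a module that imports os and random but only calls os.getcwd(), the function reports random, which corresponds to what flake8 flags as F401.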
Then, you can run one more time to be sure:
All steps passed. Files are ready to commit
Run the commands below for committing the files and finishing the first part of the tutorial:
$ git add main.py src/bar/first_module.py src/foo/first_module.py
$ git commit -m "Run pre-commit hooks"
Let’s check final versions of the files:
src/bar/first_module.py
src/foo/first_module.py
main.py
They look more elegant, right? Here are the fixed warnings:
- PEP 8: E302 expected 2 blank lines between definition of a class or function
- PEP 8: E303 too many blank lines (2) between definition of methods
- PEP 8: E501 line too long (137 > 110 characters)
- PEP 8: W292 no newline at end of file
In the second part of the tutorial, we will use the pre-commit hook we created in GitHub Actions. GitHub defines it as follows:
GitHub Actions makes it easy to automate all your software workflows, now with world-class CI/CD. Build, test, and deploy your code right from GitHub. Make code reviews, branch management, and issue triaging work the way you want.
GitHub Actions offer a lot! It is a neat and robust tool. And it has comprehensive documentation. I will use the template it suggests:
[Embedded file: GitHub Actions workflow]
I added this file under the .github/workflows directory. The last two steps are different from the example template:
- After upgrading pip, we will install the pre-commit hook via pip.
- Pre-commit will be initialized and run for all files in the repository except the excluded ones.
Since three Python versions are specified, there will be three different builds for each push to the remote repository. Here is how it looks:
All steps of the pre-commit hook are passed on GitHub as expected.
As a result, each time you commit and then push to the remote, the GitHub workflow will check the styling. Additionally, each time you commit locally, the pre-commit hook will run first, and it will prevent you from committing code that fails the checks unless you skip the hook deliberately, for example with git commit --no-verify.
You now have a setup that will help you standardize the styling of your project. Congratulations!
You can find the latest version of the repository below:
BarisSari/medium-pre-commit-article
Conclusion
It is dead simple to use the pre-commit hook both locally and on GitHub Actions. You don’t even need to change your code manually in most cases, as the hook will fix the problems automatically. However, when some errors persist, you should check the pre-commit logs and make the changes yourself.
Please keep in mind that having a pre-commit hook does not mean that your project complies with every PEP rule. For example, according to PEP 8, function names should be lowercase, with words separated by underscores. The pre-commit hook will not complain if you define a function called TEST. You should be aware of what the included libraries (black, isort, flake8) are capable of and what they are not. Still, if you are trying to follow Python’s best practices, pre-commit hooks will surely help you.
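If you want to cover a gap like this one yourself, a check can be small. The sketch below uses Python's ast module to list function names that are not written in lowercase with underscores. It is only an illustration with hypothetical names (SNAKE_CASE, find_badly_named_functions); in practice, a flake8 plugin such as pep8-naming is the more usual route.

```python
import ast
import re

# Lowercase letters, digits, and underscores, not starting with a digit.
SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")


def find_badly_named_functions(source):
    """Return the names of functions that do not follow the snake_case style."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if not SNAKE_CASE.match(node.name):
                names.append(node.name)
    return sorted(names)
```

Run against a module that defines TEST and say_hello, the function returns only TEST.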
Note: Please keep in mind that I gave silly names for modules and packages. I tried to show how simple and effective pre-commit hooks are. That’s why I tried to make modules and packages as silly as I could. You can check this gist for some of the best practices of Python such as naming conventions.
Note-2: You may use the pre-commit hook for other programming languages as well. Please check this link to see all supported hooks.
Note-3: I’m using PyCharm, and it runs the pre-commit hook even when I use its commit & push screen. If you don’t use the terminal for Git, you can check whether your IDE has this functionality.
Note-4: There are other great CI/CD tools such as Travis, CircleCI, and Jenkins. Once you set up and configure a pre-commit hook for your project, you can always include it in your favourite CI/CD tool by adding a few lines. There are many examples on the internet. You can check them out.
Thank you for reading up to the end! I’m looking forward to hearing your responses.