In the last year I’ve been working regularly with Python, porting a UI library to Dash and supporting people making apps with it.
It’s the first time I’ve used Python for work; and having forgotten what little Python I’d ever known, I’ve basically been learning from scratch.
I’ve found Python-the-language pretty approachable, and the web is full of tutorials for it. Python-the-ecosystem hasn’t been as approachable though, and there seems to be less written about the practical plumbing of Python projects.
When I asked around, everyone from experienced Python devs to other recent learners said they’d found the same thing – lots of language intros, not much on the tooling. So I decided to write the blog post I was looking for at the start.
To be clear, this is not presented as expert Python advice. It’s a survival guide from a Python novice, intended for even-more-recent Python arrivals. I’m sure some of it’s not optimal (i.e. wrong), but overall hopefully it’s good enough to be useful. It also doesn’t get into the language much, although I do give some links if you need them.
TL;DR
This is the short, short version:
NodeJS thing | Python thing | Notes |
---|---|---|
NodeJS LTS | Python 3.x | Python doesn’t run a specific LTS stream; instead PEP-0602 notes that the five-year support period for 3.x releases makes them all effectively LTS. While PEP-0602 refers to “major versions”, it means point releases under Python 3. |
NVM | Pyenv | Both manage versions of the runtime and allow you to commit a version dotfile. |
NPM | PIP & PyPI | Standard/default package manager bundled with the runtime installer. In both ecosystems people tend to refer to the installer and registry interchangeably. |
Global & local install | venv | PIP and NPM default to opposite behaviours. NPM installs locally by default, globally when specified with npm i -g. PIP is global by default, and local installation is handled with virtual environments (usually created with venv). |
Yarn, Rome | Poetry, Pipenv, Conda | These are only very loosely comparable: non-standard tools that do basically the same thing as the standard tools but in a different way; and they add functionality beyond the standard tools. Poetry is probably the most “nodejs-ish”. Conda is the least like any other option, particularly as it is a registry for other languages as well. |
JavaDoc | Docstrings | Inline documentation that enables intellisense/autocomplete/IDE hints. |
Prettier | Black | Both automagically reformat your code to stop linters bugging you about formatting. |
ESLint | Pylint | Both check for errors and formatting. |
node_modules | env, __pycache__ | Stuff you’ll want to gitignore. |
If that’s all you need, good night and good luck! If you want more detail, read on.
Python origin and governance
Programming languages and their communities are shaped by their creators and history, so I’ll start there.
Python was started in 1989 by Dutch programmer Guido van Rossum. The story is that it was a holiday project, and he named it Python because he was a fan of Monty Python. This is why you will see a lot of spam references in Python docs and tutorials, but not as many reptile references as you might expect. It’s also why quirks of Python are occasionally explained with a joke that it might make more sense if you’re Dutch.
Early versions of Python were used internally at Van Rossum’s workplace (CWI) in 1990, with the first public release happening in 1991 and v1.0.0 in 1994.
Van Rossum remained central to Python’s direction for nearly thirty years, declared by the community to be “Benevolent Dictator For Life”. There was also a group of core developers and from 2000 onwards a mechanism known as “PEPs” for handling ideas for language changes.
The nonprofit Python Software Foundation was launched in 2001. It holds the IP of Python, and handles some community matters like a conference and Python grants program.
In 2018 Van Rossum took a “permanent vacation” from BDFL, saying it was time for a “transfer of power”. A steering council was set up instead, and Van Rossum stepped down entirely in 2020.
There were some fears the leadership change could hurt Python’s popularity, but they seem to have been unfounded as it reached second place on the TIOBE index in May 2021.
(If you want a detailed history, check out Van Rossum’s history of Python)
PEPs
You’ll often see people refer to PEP numbers without further explanation, so it’s good to know what they are and where to find them. Appropriately, PEP-1 defines PEPs…
PEP stands for Python Enhancement Proposal. A PEP is a design document providing information to the Python community, or describing a new feature for Python or its processes or environment.
It’s a deceptively simple-sounding purpose, as it means PEPs can address much more than language features. A few examples to illustrate the range that opens up:
- PEP 8 is the style guide
- PEP 20 is the Zen of Python
- PEP 257 sets out docstring conventions
- PEP 440 defines version specifiers
There is a categorised index of PEPs at python.org.
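To make one of those concrete, here’s a minimal sketch of what the PEP 257 layout tends to look like in practice. The greet function is made up for the example; only the docstring shape (a one-line summary, a blank line, then detail) comes from the PEP.

```python
def greet(name, punctuation="!"):
    """Return a greeting for the given name.

    The first line is a one-sentence summary; any further detail
    follows after a blank line. greet() itself is a hypothetical example.
    """
    return f"Hello, {name}{punctuation}"


# The docstring is stored on the function object, which is where
# help() and editor hints read it from.
print(greet.__doc__.splitlines()[0])
print(greet("world"))
```

If you’re coming from inline doc comments in JavaScript, the main difference is that the docstring lives inside the function body rather than in a comment above it.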
Python 2 vs 3
Since Python 2 no longer receives security updates, there’s no question you should be using Python 3 now. But it’s not as simple as you might expect…
Python 3 came out in 2008 and Python 2 was due for end-of-life in 2015. For a variety of reasons Python 2’s EOL was pushed back to 2020… twelve years after Python 3 came out.
The unusually long crossover period leaves an odd legacy: it’s common to find both 2 and 3 installed, but also common to find just 2 or 3; and when both are present it’s not guaranteed that 3 will be the system default. The recency of Python 2’s EOL means some Linux distros are just starting to ship without Python 2 installed.
What that means is builds that worked for years, quietly calling Python 2, have started breaking after OS updates or upgrading to new base images. Remember it’s not just your own code, it’s every single dependency in the tree, including NodeJS dependencies that you may not even realise were running gyp during installation. It only takes one old dependency calling Python 2 to break your build… take a guess how I know that.
So in your CLI of choice you should start by running…
python --version
python2 --version
python3 --version
...to see what you’re dealing with. Then at least you know which versions are available, and which one is set as default.
To get a further handle on runtimes, also run…
which python
which python2
which python3
...to see which versions are system installs, and which (if any) are shimmed.
Finally, call the point version. For example if your python3 is Python 3.9.1, run python3.9 --version. You may never need to do this again, but it’s good to know it’s possible.
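If you’d rather gather the same information from a script, the checks above can be sketched with the standard library. This is a rough equivalent rather than a replacement for the commands: find_interpreters and reported_version are names I’ve made up, and the list of command names is just the three used above.

```python
import shutil
import subprocess


def find_interpreters(names=("python", "python2", "python3")):
    """Map each command name to its location on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


def reported_version(path):
    """Ask an interpreter for its version string."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Python 2 prints its version to stderr, Python 3 to stdout.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    for name, path in find_interpreters().items():
        if path is None:
            print(f"{name}: not found")
        else:
            print(f"{name}: {path} ({reported_version(path)})")
```

The path shown for each command is the same thing which would print, so a shimmed install should still be recognisable from its location.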
To manage all this Fun™, what I’ve personally been doing so far:
- use the explicit python3 and pip3 commands in scripts that run in multiple environments (including uncontrolled environments like other people’s workstations)
- use Pyenv to manage the runtime version in projects. Pyenv works on Linux and Mac and there is a Pyenv-win fork for Windows.
- on my workstation python is set to Python 3
- if you identify something still calling python2, schedule work to update or remove that dependency
You may prefer a different solution. My real advice here is just that you have to have a solution, because you cannot blindly rely on pre-installed versions of Python.
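One extra safety net for scripts that end up in uncontrolled environments is to have the script itself fail fast when it’s run by the wrong interpreter. A minimal sketch follows; the (3, 8) minimum and the check_version name are assumptions for illustration, so substitute whatever your project actually needs.

```python
import sys

# Assumed minimum for this example only; use your project's real requirement.
MINIMUM = (3, 8)


def check_version(current=sys.version_info, minimum=MINIMUM):
    """Return True when `current` is at least the `minimum` (major, minor)."""
    return tuple(current[:2]) >= tuple(minimum)


if not check_version():
    found = ".".join(str(part) for part in sys.version_info[:3])
    wanted = ".".join(str(part) for part in MINIMUM)
    # str.format is used instead of an f-string so this file still parses,
    # and can print the message, when an old interpreter runs it.
    sys.exit("This script needs Python {}+, but found {}".format(wanted, found))
```

It doesn’t fix anything, but it should turn a confusing failure deep in the script into a one-line explanation at the top.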
Packages and package scope
The basics for packages:
- PyPI is Python’s go-to package registry
- PIP is the standard tool for installing PyPI packages
- venv is a module in Python that creates virtual environments
PIP has been bundled with Python since v3.4 and uses requirements.txt as its standard manifest file (called with pip3 install -r requirements.txt). It doesn’t separate dev and production dependencies in a single manifest, although you can have multiple manifest files and install different ones in different environments.
PIP installs packages globally in the active environment – by default that means globally on the host system. To constrain dependencies to a specific project, you need to create a virtual environment, then install dependencies and run the app inside the virtual environment.
The virtual environment is created with venv and typically placed in a subdirectory named env. You explicitly create, activate and deactivate the environment.
A naive shell script to illustrate this process:
echo "Creating environment"
python3 -m venv env
echo "Activating environment"
source env/bin/activate
echo "Installing dependencies"
pip3 install -r requirements.txt
echo "Starting app"
python3 app.py
echo "Deactivating environment"
deactivate
You can obviously create smarter scripts, or use an alternative. You should also be aware that it’s possible to create nested virtual environments (env inside an env).
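Because venv is an ordinary standard library module, the create step from the shell script above can also be done from Python, and a script can check whether it’s currently running inside an environment. This is a sketch under a couple of assumptions: create_env and in_virtual_env are hypothetical helper names, and the sys.prefix comparison applies to environments made by venv (older third-party tools may behave differently).

```python
import sys
import venv
from pathlib import Path


def in_virtual_env():
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


def create_env(path="env", with_pip=True):
    """Create a virtual environment at `path`, much like `python3 -m venv env`."""
    venv.create(path, with_pip=with_pip)
    return Path(path)


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtual_env())
```

Activation is still a shell concern, so this doesn’t replace source env/bin/activate; it’s mostly useful as a guard before a script starts installing packages.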
Alternatives to PIP and venv
It’s no big surprise that there are PIP alternatives, or that many are designed to manage virtual environments for you.
Some popular options:
- Poetry is a package, dependency and environment manager. It uses a .toml manifest file instead of requirements.txt, generates a lockfile and differentiates dev and production dependencies. It will either use a supplied virtual environment or create one for you. Poetry is likely to feel familiar to NodeJS devs.
- Pipenv is a package, dependency and environment manager with first-class support for Windows. It uses a Pipfile instead of requirements.txt and generates a lockfile.
- Conda is a package, dependency and environment manager “for any language” and not just Python.
There are plenty of posts out there arguing for each of these tools; and more options to boot.
I’d suggest you consider them mostly in terms of philosophy – eg. do you prefer to use standard tools managed with a script, or do you prefer to set up non-standard environments that handle things for you? Do you prefer separate tools or all-in-one solutions?
One ironclad rule I can give you is do not mix PIP, Poetry, Pipenv or Conda in any given project – pick one and stick to it. I’ve seen a few people mix them and they Have A Bad Time™.
Runtimes in Linux and Docker
While a runtime manager is fine for your workstation, you probably don’t want to use one in CI/CD. The gotcha is the python3 package on many Linux distros isn’t the latest version of Python 3.
The solution usually boils down to adding another source/PPA to upgrade python3 to your desired version, or installing a separate package for the point release and using that. The details vary by distro, e.g. on Debian you need to add the testing repos, Alpine needs a different pip package and Ubuntu uses the deadsnakes PPA.
If you are using Docker, the official Python images are great. If you need NodeJS as well as Python, the official NodeJS images are great but you’re back to the problems of trying to install Python. Better to use the methodology in nikolaik’s docker-python-nodejs Dockerfile, which is to use an official Python base image and install NodeJS (it’s easier that way around).
If you have a build that still needs Python 2, the good news is you’re probably just an apt-get away from 2.7. You will want to weed out whatever still needs Python 2, but at least the quick fix is actually quick.
Where to from here?
There are a lot more things you will need to learn; I’ve just run through the ones I’ve seen trip a few people up. From here, there’s also a lot of learning Python itself. Since it’s a multi-paradigm language used for just about everything, what you specifically need will vary accordingly. So I’ll finish up with a few quick notes around syntax and some links to things you might need.
To call out one single resource, in some ways The Hitchhiker’s Guide to Python is the resource I was looking for. But it is pretty huge and I’m not sure I’d have been able to digest it without a bit of prior orientation. Since I found it late in the process of writing this post, I’ll never know.
Syntax and convention
While this post isn’t trying to be a language tutorial, there are some aspects of syntax and convention that help illustrate the general Vibe Of The Thing.
python -c "import this"
Idiomatic Python is commonly described as “pythonic”, and to understand the pythonic way of things PEP-20 and PEP-8 are a good start.
You can’t talk about Python without noting it has whitespace-significant syntax. Get your indentation wrong and things break. This upsets a lot of people. It doesn’t really bug me, although you do need some decent syntax highlighting in your editor.
The next obligatory observation is that, appropriately, Python loves Snake Case. So get_used_to_underscores_you_will_see_them_a_lot.
A common convention in Python is matching indentation to an opening delimiter, a style set out in PEP-8:
foo = long_function_name(var_one, var_two,
                         var_three, var_four)
I find this one pretty strange, and it quickly becomes unreadable for UI code as the nesting adds up so fast.
Thankfully PEP-8 is not as inflexible as people often assume it is. PEP-8 starts with a warning that A Foolish Consistency is the Hobgoblin of Little Minds and calls out the need to break the rules when they would make code harder to read.
It doesn't matter in this particular case, as PEP-8 explicitly allows hanging indentation anyway:
foo = long_function_name(
    var_one, var_two,
    var_three, var_four)
On a few occasions I’ve had to assure people this is acceptable within the style guide, usually because their UI code was becoming a nightmare.
It is worth spending a little time learning about args, kwargs (keyword arguments) and positional arguments for functions. They are useful and powerful, but I wouldn’t describe them as intuitive (...maybe if I were Dutch?).
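As a starting point, here’s a small sketch of how a single call gets sorted into positional arguments, *args and **kwargs. The describe function and its parameter names are invented for the example; the sorting behaviour is standard Python.

```python
def describe(first, *args, flag=False, **kwargs):
    """Show which bucket each argument of a call ends up in."""
    return {
        "first": first,    # required positional argument
        "args": args,      # any extra positional arguments, as a tuple
        "flag": flag,      # keyword-only, because it comes after *args
        "kwargs": kwargs,  # any other keyword arguments, as a dict
    }


print(describe(1, 2, 3, flag=True, colour="red"))

# The same * and ** markers unpack a sequence or dict at the call site.
extras = (2, 3)
options = {"flag": True, "colour": "red"}
print(describe(1, *extras, **options))
```

Both calls produce the same result. If you’re coming from JavaScript, *args is roughly the rest/spread operator for positional arguments, and **kwargs is closer to passing and spreading an options object.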
There are plenty of other syntax quirks, but these are the big ones that most people seem to comment on.
Module imports
Python’s module system lets you import anything from a .py file, using the filename as the module name. So myclevermodule.py can be imported as myclevermodule.
The newbie mistake here is that since any file at all can be an import, you can accidentally override a built-in module if you happen to create a file with the same name.
I ran into this while drilling on language basics – I created types.py to do some exercises with Python types, and that file worked just fine. But other files in the same directory suddenly blew up, as they were trying to find Python’s type system in my types.py file.
There doesn’t seem to be much protection against this, despite being cited as one of the more common traps for new players. Most blog posts still boil down to “just…err…don’t clash with the built-in python modules…”. So, either memorise everything in the Python Module Index or namespace all of your .py files.
A few useful links
The official language resources are pretty good:
Some good unofficial resources:
And finally a grab-bag of potentially useful things that were too big to include in this post:
- You may want to look into Dash, Flask, deploying to WSGI servers and Gunicorn
- If you need multi-threaded apps, read up on the Global Interpreter Lock
- You might find Black and Pylint useful
- If you want a dedicated Python IDE, try PyCharm
Last thoughts
Most ecosystems have some tribal knowledge and traps for new players, and Python’s no different. I still consider myself a Python newbie, and as per the big disclaimer it’s not a question of “if” something here is wrong, but “when” I’ll realise it.
But at least I can manage the basics now; and can spend more time focused on actually doing stuff. Hopefully this post will help someone else get to that point a bit quicker.