
To upper bound or not – the Python packaging debates

Written by Wolf Vollprecht a year ago

Two package management philosophies

The packaging world operates under two philosophies: the self-publishing philosophy used on PyPI, npm, crates.io, and other language-specific package management systems, and the philosophy of a distribution run by "professional" volunteers like Debian, Fedora or conda-forge, who ensure that all packages run well together. There are a couple of differences between these two approaches; the major ones are as follows.

When self-publishing, releases are usually immediate when the software author decides to publish a new version. In a distribution, the volunteer community needs to pick up the release and create a new package for it, which takes time and usually lags a bit behind.

On the other hand, in a package distro, packaging issues can be fixed for all packages immediately because there is a central repository index and global control. In a self-publishing model, packaging fixes are usually applied with some delay and by the individual developers. In the self-publishing case, the package maintainer is usually also the software author – which means they sometimes lack specific expertise in software packaging.

In a distro, the packages are generated on hosted infrastructure in a reproducible fashion. When self-publishing, authors can decide where they want to run the publishing step, for example in a CI system or on their local computer.

Two major players in the “Python packaging world” are PyPI and conda-forge. The Python Package Index is a classic example of self-publishing, while conda-forge is a proper package distribution with a core team that maintains the consistency of the entire distribution. However, unlike some Linux distributions, the barrier to becoming a package maintainer in the conda-forge ecosystem is pretty low (you just need to get a recipe merged). The entire packaging community in conda-forge comprises over 4500 individual contributors.

The conda-forge core team consists of roughly 20 people who have "admin" access to the repository and can make far-reaching changes, such as deciding on a new compiler version or what global flags are to be used. Achieving consensus in the core team can take longer (compared to self-publishing), which can make certain release processes slower. However, the quality of such decisions is hopefully better!

Constraining dependencies with upper bounds

Versioning software is an art in itself and has been intensely debated in the past. Two approaches are dominant these days: semantic versioning (SemVer) and calendar versioning (CalVer).

Semantic versioning tries to help humans understand how the software has evolved. Major releases (e.g. jumping from 1.4.2 to 2.0.0) notify users of the software that some API interfaces have changed. In contrast, incrementing only the minor or patch version (1.4.2 to 1.4.3) signals that no API breaks have happened.
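To illustrate the convention, the widely used packaging library can split a version into its components; comparing major numbers is the SemVer heuristic for "could this upgrade be breaking?" (a minimal sketch, not part of any real resolver):

```python
from packaging.version import Version

old, new = Version("1.4.2"), Version("1.4.3")
# Same major version: under SemVer, this upgrade should be API-compatible.
print(new.major == old.major)  # True

breaking = Version("2.0.0")
# The major version changed: SemVer signals that APIs may have broken.
print(breaking.major == old.major)  # False
```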

In this sense, library authors who want to ensure that their software & dependencies keep working for a long time might want to add "upper bounds". For example, a project that depends on NumPy might want to constrain the version to numpy >1.4,<2. That means the project and NumPy 2.0 cannot be installed together.
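The effect of such a specifier is easy to check with the packaging library (a minimal sketch of how the bound behaves, independent of any particular installer):

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

spec = SpecifierSet(">1.4,<2")

print(Version("1.9.0") in spec)  # True: within the allowed range
print(Version("2.0.0") in spec)  # False: excluded by the upper bound
```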

A problem that was intensely debated in the Python packaging world is whether upper bounds should be used or not. The problem boils down to not knowing the future. You can read a very long and detailed article by Henry Schreiner here. The key takeaway is that upper bounds are considered harmful because they cannot be overridden by users. A single (bad) upper bound will keep whole environments limited to older versions of packages (even though there might not be an actual incompatibility). In self-publishing ecosystems like PyPI, centrally managing (and changing) the upper bounds of existing packages is not possible, so previously published packages with bad upper bounds cannot be modified either.

On the other hand, upper bounds are no big problem in a “distro”. One of the most powerful features in the conda-forge toolbelt is repodata patching, which allows maintainers to change dependencies and dependency pins as they see fit. For example, if NumPy released a version 1.5 tomorrow that broke every package depending on NumPy, conda-forge maintainers could simply add a strict upper bound to all packages in the repository and prevent them from being installed alongside NumPy 1.5. This is not possible on PyPI, where such centralised enforcement does not exist.
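To make the mechanism concrete, here is a minimal sketch of the idea (hypothetical data, and not conda-forge's actual patching code, which lives in the repodata-patches feedstock): walk a repodata index and tighten every loose NumPy pin.

```python
import json

# Hypothetical repodata excerpt; real repodata.json files are much larger.
repodata = {
    "packages": {
        "somepkg-0.1-py_0.tar.bz2": {"depends": ["python >=3.8", "numpy >=1.4"]},
    }
}

# Add an upper bound to every numpy dependency that does not have one yet,
# excluding the (hypothetically broken) 1.5 release.
for record in repodata["packages"].values():
    record["depends"] = [
        dep + ",<1.5" if dep.startswith("numpy") and "<" not in dep else dep
        for dep in record["depends"]
    ]

print(json.dumps(repodata, indent=2))
```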

What if ... we introduce some magic? 🪄

The problem of upper bounds is still real – even in the conda-forge world where it is generally fixable (with lots of human labor). At prefix.dev, we are currently asking ourselves if there is a third way: what if, instead of resolving against handcrafted version ranges, we can resolve against the actual library symbols? Is it possible to automatically determine what functions and constants a package is using, and figure out compatible version ranges based on that?
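A rough sketch of the "which symbols does this code use" half of that question, using nothing but Python's standard ast module (real tooling would need to resolve import aliases and handle many more cases):

```python
import ast

source = """
import numpy as np
from collections import OrderedDict

x = np.linspace(0, 1, 10)
d = OrderedDict()
"""

used = set()
for node in ast.walk(ast.parse(source)):
    # Names imported via `from module import name`.
    if isinstance(node, ast.ImportFrom):
        used.update(f"{node.module}.{alias.name}" for alias in node.names)
    # Attribute accesses such as `np.linspace` (alias resolution omitted).
    elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
        used.add(f"{node.value.id}.{node.attr}")

print(sorted(used))  # ['collections.OrderedDict', 'np.linspace']
```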

This has the potential to help library authors tremendously. Instead of guessing compatibility ranges, the computer does the hard work. It would ensure that the constraint solver not only finds valid solutions, but solutions in which the packages actually work well together.

We are currently working on tooling to index packages (starting with Python and C/C++) and record all their exported & imported functions; a first sketch of what that could look like on the Python side follows below. Related work has already been done in this area in the past.
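For the Python side, a first approximation of the "exported symbols" index is to introspect an installed module and record its public names (only a sketch; a real indexer would also need to cover compiled extensions and C/C++ shared-library symbols):

```python
import importlib
import inspect

def exported_symbols(module_name: str) -> dict[str, str]:
    """Record the public names a module exports, with their object types."""
    mod = importlib.import_module(module_name)
    return {
        name: type(obj).__name__
        for name, obj in inspect.getmembers(mod)
        if not name.startswith("_")
    }

# Example: index the standard library's json module.
print(sorted(exported_symbols("json")))
```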

This is a big project and we're always looking for help. We are also happy to discuss all of this on our Discord – come join us.

We are looking forward to reporting back with some initial results soon!