Adopting uv in pixi
Our mission at Prefix is to radically improve the life of conda users, which is why we are building pixi. Conda users often add pip dependencies in their `environment.yml` files. We want to make the integration of `pip` and `conda` seamless and allow users to manage their `pip` dependencies as well. In the past, `conda` literally just called `pip` as an external command to install dependencies from PyPI into the environment. With pixi we always wanted to go further. Our vision is to natively integrate a pip-style package resolver: pixi lockfiles should not only record conda packages, but also lock down versions of PyPI dependencies.
This can only be achieved by deeply integrating the resolver into pixi. To that end we built `rip`, our "barebones" PyPI resolver, and integrated it into pixi. But now that `uv` has entered the scene, it's time to reconsider!
We're already working on integrating `uv` into pixi! `rip` can definitely compete with `uv` in benchmarks – however, feature-wise it makes more sense to integrate `uv`, as it enables editable installs and git & path dependencies right away. Integrating `uv` also lets us focus on tackling the most important issues in the conda ecosystem. Read on for our benchmarks of `uv` vs `rip` – and why we are still switching to `uv`.
Benchmarking rip and uv
A flamegraph of installing `apache-airflow` with `rip` shows where the time goes.
A major difference between `uv` and `rip` is the resolver. We use our own library, `resolvo`, which is extremely performant when resolving conda and pip packages. We recently made some cool improvements in `resolvo` by making it (optionally) async – that made `rip` up to 3x faster against PyPI indexes, where metadata needs to be retrieved lazily over the network. We plan to use this for conda indexes as well (think sparse indexes for smaller downloads).
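To illustrate why an async resolver helps with lazily fetched index metadata, here is a minimal Python sketch – a simplified stand-in, not `resolvo`'s actual code, with `asyncio.sleep` simulating network latency. When all metadata requests are in flight concurrently, the total wall time is roughly one round trip instead of one round trip per package:

```python
import asyncio
import time

async def fetch_metadata(pkg: str, delay: float = 0.05) -> dict:
    """Hypothetical stand-in for lazily fetching one package's
    metadata from a PyPI-style index over the network."""
    await asyncio.sleep(delay)  # simulated network latency
    return {"name": pkg, "requires": []}

async def fetch_all(pkgs: list[str]) -> list[dict]:
    # All requests are in flight at once, so the total wall time is
    # roughly one round trip instead of one per package.
    return await asyncio.gather(*(fetch_metadata(p) for p in pkgs))

if __name__ == "__main__":
    pkgs = [f"pkg-{i}" for i in range(20)]
    start = time.perf_counter()
    metadata = asyncio.run(fetch_all(pkgs))
    elapsed = time.perf_counter() - start
    # Sequentially this would take ~1 s (20 × 50 ms); concurrently ~50 ms.
    print(f"fetched {len(metadata)} packages in {elapsed:.2f}s")
```

The same idea applies to conda indexes: the cheaper it is to ask for metadata on demand, the more attractive sparse, lazily fetched indexes become.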
However, `rip` is slower than `uv`. Looking at the flamegraph, a good part of "being slower" boils down to the following differences:
- The cache that `rip` uses stores wheel files (which are effectively zip files), so at every installation we pay the price of extracting the wheel. `uv` stores the extracted files, which means it just needs to "link" them into the environment. We actually do this in mamba and rattler for conda packages, but we haven't yet gotten around to implementing it in `rip`.
- We check all SHA256 sums of the `RECORD` files when installing. Calculating the SHA256 hash of a file is a relatively costly operation, but it helps detect that the cached file is intact. However, this takes up about 25% of the entire installation time. `uv` currently skips any SHA256 computation.
- When requesting metadata from the PyPI index, `rip` follows the pip rules about cache timeouts. The pip developers decided that users always expect fresh packages when asking the server (e.g. right after you uploaded a package it should be installable). For this reason, pip and `rip` send `max-age=0` with the HTTP request to retrieve new packages. However, this incurs a significant overhead when resolving. `uv` follows the max-age reported by the server (10 minutes). That means if you have a warm cache, the solver can get to work immediately, without any extra requests.
We would be able to fix all of these issues (some are even trivial to fix), and `resolvo` is performing excellently.
Interestingly, since the `rip` and `uv` resolvers share similar theoretical SAT foundations, they also share some pitfalls. Resolving `urllib3 botocore` with either `rip` or `uv` takes a significant amount of time, because both resolvers try all versions of `botocore` before testing another `urllib3` version.
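This backtracking behaviour is easy to reproduce with a toy depth-first resolver – a deliberately naive Python sketch, not the actual algorithm of either tool, with made-up version numbers and constraints. If every `botocore` candidate pins an older `urllib3`, the solver exhausts the full `botocore` list once per newer `urllib3` version before it finally backtracks:

```python
def resolve(urllib3_versions, botocore_versions, compatible):
    """Naive depth-first resolution over two packages: pick an
    urllib3 version, then try every botocore version under it
    before backtracking. Returns (solution, candidates tried)."""
    attempts = 0
    for u in urllib3_versions:        # newest first
        for b in botocore_versions:   # newest first
            attempts += 1
            if compatible(u, b):
                return (u, b), attempts
    return None, attempts

# Hypothetical constraint: every botocore release pins urllib3 < 2,
# so only the oldest urllib3 candidate can ever succeed.
urllib3 = ["2.2", "2.1", "1.26"]
botocore = [f"1.34.{i}" for i in range(50, 0, -1)]  # 50 releases
solution, tried = resolve(urllib3, botocore, lambda u, b: u == "1.26")
# All 50 botocore candidates are burned through twice (once per
# incompatible urllib3 choice) before the first success: 101 tries.
```

A solver that instead backtracked on `urllib3` after the first conflict would reach the same solution in 3 tries, which is why candidate-ordering heuristics matter so much in practice.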
Python packages are pretty thorny
Unrelated to the resolver, building a Python package manager is not easy. There is a lot of historical baggage, like old sdists with `setup.py` files. Some of the challenges:
- sdists need to be built during the solve step, possibly building multiple versions while backtracking
- some packages are not even in the required archive layout (`hexdump`, looking at you)
- some packages ship with outright wrong `RECORD` files, e.g. some SHA256 checksums are just wrong
- pre-release handling is somewhat underspecified
We already dealt with a bunch of these corner cases, but ultimately it doesn't seem very worthwhile for the Python ecosystem to have two competing Rust implementations that deal with Python packages and all their corner cases.
In conclusion, we are very excited to integrate `uv` into pixi. It will quickly open the door to many highly anticipated features – such as editable installs and git & path dependencies in pixi – and will make our PyPI integration more feature-complete. We've already started working on the integration and expect it to land in a few weeks.
This also means that we can refocus some of our energy on the things that need to get done in the conda ecosystem. We are actively working on some of the most important "conda enhancement proposals" (think PEP, but for conda) to change the way conda package recipes are written. We also have a few more important enhancements coming up for standardization: optional dependencies, smaller repodata files, and many ideas to push the conda ecosystem forward.
Our goals for pixi
Our goals for pixi remain unchanged – we want to solve dependency hell.
Pixi can manage your project-scoped and global software installations. We have excellent docs at https://pixi.sh. Installing pixi is a simple one-liner; it is also available from `conda-forge`, `brew`, `winget`, `arch`, and more.
On a global level, you can install tools that you would otherwise download from apt-get, Homebrew, or WinGet – just much faster and with the same tool on all platforms.
When you add a `pixi.toml` file to your project you get:
- automatic lockfiles (like `poetry`, `cargo`, or `yarn`)
- a powerful task system with simple entry points to compile, install, and run your packages and workflows
- extremely fast installation of conda packages, with the entire open source `conda-forge` ecosystem at your fingertips – `conda-forge` contains lots of your favorite tools
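As a sketch of what this looks like in practice, a minimal `pixi.toml` for a hypothetical project might contain something like the following (the project name, platforms, dependencies, and task here are made up; the exact fields are described in the pixi docs):

```toml
[project]
name = "my-project"
channels = ["conda-forge"]
platforms = ["linux-64", "osx-arm64", "win-64"]

[tasks]
test = "pytest -v"

[dependencies]
python = ">=3.11"
pytest = "*"
```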
Working on a project should be as simple as initializing it, adding your dependencies, and running your tasks.
What are conda packages?
A lot of people associate conda packages with (only) "Python", but that is pretty far from the truth.
Conda packages can contain all kinds of software (such as R, C++, Java, etc.) and are not limited to Python.
Usually, conda packages contain shared libraries, and as such conda is much closer in spirit to a Linux distribution like Ubuntu/Fedora (e.g. `apt-get` or `yum`) than to pip.
That is why `conda-forge` not only ships Python packages, but also the Python interpreter and all its dependencies (such as `bzip2`, `openssl`, etc.) – as well as R, Julia, and many other languages.
What's next
We have a couple of blog posts coming up: how to move from `conda` to `pixi`, and how to build your own conda packages with `rattler-build`. Make sure to follow us on Twitter or join our Discord server.