Adopting uv in pixi

Written by Wolf Vollprecht 4 months ago

Our mission at Prefix is to radically improve the lives of conda users, which is why we are building pixi. Conda users often add pip dependencies to their environment.yml files. We want to make the integration of pip and conda seamless and let users manage their pip dependencies as well. In the past, conda simply called pip as an external command to install dependencies from PyPI into the environment. With pixi we always wanted to go further: our vision is to natively integrate a pip-style package resolver. Pixi lockfiles should not only record conda packages, but also lock down the versions of PyPI dependencies.

This can only be achieved by deeply integrating the resolver into pixi. To achieve this we built rip – our "barebones" PyPI resolver that we integrated into pixi. But now that uv has entered the scene, it's time to reconsider!

We're already working on integrating uv into pixi! rip can definitely compete in benchmarks with uv – however, feature-wise it makes more sense to integrate uv as it will enable editable installs, git & path dependencies right away. Integrating with uv also lets us focus on tackling the most important issues in the conda ecosystem.

Read on to check our benchmarks of uv vs rip – and why we are still switching to uv.

Benchmarking rip and uv

A flamegraph of installing apache-airflow with rip looks like this:

A flamegraph of rip - courtesy of cargo-flamegraph

A major difference between uv and rip is the resolver. We use our own library, called resolvo, which is extremely performant when resolving conda and pip packages.

We recently made some cool improvements in resolvo by making it (optionally) async – that made rip up to 3x faster for PyPI indexes (where metadata needs to be retrieved lazily over the network). We also plan to use this for conda indexes (think sparse indexes for smaller downloads).

However, rip is slower than uv. If we look at the flamegraph, we can observe that a good part of "being slower" boils down to the following differences:

  • The cache that rip uses stores wheel files (which are effectively zip files), so at every installation we pay the price of extracting the wheel. uv stores the extracted files, which means it just needs to "link" them into the environment. We actually do this in mamba and rattler for conda packages, but we haven't yet gotten around to implementing it in rip.
  • We check all SHA256 sums listed in the RECORD files when installing. Calculating the SHA256 hash of a file is a relatively costly operation, but it helps to verify that the cached file is correct. However, this takes up about 25% of the entire installation time. uv currently skips any SHA256 computation.
  • When requesting metadata from the PyPI index, rip follows the pip rules about cache timeouts. The pip developers decided that users always expect fresh packages when asking the server (e.g. right after you upload a package it should be installable). For this reason, pip and rip use max-age=0 for the HTTP request to retrieve new packages. However, this incurs a significant overhead when resolving. uv follows the max-age as reported by the server (10 minutes). That means that if you have a warm cache, the solver can get to work immediately, without any extra requests.
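To make the RECORD check from the list above more concrete, here is a minimal sketch in Python. It is not how rip implements the check. It assumes sha256 entries in the RECORD format defined by the wheel specification (path, then `sha256=` followed by the unpadded urlsafe-base64 digest, then size), and it uses a hypothetical in-memory dict in place of a real extracted wheel. The function names are made up for illustration.

```python
import base64
import csv
import hashlib
import io


def record_digest(data: bytes) -> str:
    """Format a digest the way RECORD lists it: sha256=<urlsafe base64, no padding>."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def find_mismatches(record_text, files):
    """Return the paths whose content does not match the hash listed in RECORD.

    `files` maps archive paths to file contents (a stand-in for the wheel).
    """
    mismatches = []
    for row in csv.reader(io.StringIO(record_text)):
        # Rows without a hash (such as the RECORD file itself) are skipped.
        if len(row) < 2 or not row[1]:
            continue
        path, expected = row[0], row[1]
        # Hashing every file is the costly part of the check.
        if path not in files or record_digest(files[path]) != expected:
            mismatches.append(path)
    return mismatches


# Hypothetical wheel contents and a matching RECORD file.
files = {"demo/__init__.py": b"print('hello')\n"}
record_text = (
    "demo/__init__.py,"
    + record_digest(files["demo/__init__.py"])
    + ","
    + str(len(files["demo/__init__.py"]))
    + "\n"
    "demo-1.0.dist-info/RECORD,,\n"
)

print(find_mismatches(record_text, files))
```

Every file has to be read and hashed in full, which is why this step shows up so prominently in the flamegraph.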

We would be able to fix all of these issues (some are even trivial to fix), and resolvo is performing excellently.

Interestingly, because the rip and uv resolvers share similar theoretical SAT foundations, they also share some pitfalls. Resolving urllib3 together with botocore takes a significant amount of time with either rip or uv, because both resolvers try all versions of botocore before testing another urllib3 version.
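The toy Python sketch below illustrates this pitfall. It is a naive depth-first search, not the SAT-based algorithm used by resolvo or uv. The version lists and the "botocore requires urllib3<2" constraint are made-up assumptions, chosen only to show the search order.

```python
# Hypothetical, heavily simplified candidate lists, newest first.
URLLIB3_VERSIONS = ["2.0", "1.26"]
BOTOCORE_VERSIONS = ["1.29.%d" % patch for patch in range(100, 0, -1)]


def botocore_accepts(botocore_version, urllib3_version):
    """Assumed constraint for this sketch: every botocore release needs urllib3<2."""
    return urllib3_version.startswith("1.")


def resolve():
    """Fix urllib3 first, then try every botocore version before backtracking."""
    attempts = 0
    for urllib3 in URLLIB3_VERSIONS:
        for botocore in BOTOCORE_VERSIONS:
            # In a real resolver each attempt may require fetching metadata.
            attempts += 1
            if botocore_accepts(botocore, urllib3):
                return {"urllib3": urllib3, "botocore": botocore}, attempts
    return None, attempts


solution, attempts = resolve()
print(solution, attempts)
```

With these assumed inputs, the search rejects all 100 botocore candidates against urllib3 2.0 before it reconsiders urllib3 and succeeds on the next attempt. With real packages each of those rejections may also involve fetching metadata, which is where the time goes.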

Python packages are pretty thorny

Unrelated to the resolver, building a Python package manager is not easy. There is a lot of historical baggage, like old sdists with files. Some of the challenges:

  • sdists that need to be built during the solve step, possibly in multiple versions while backtracking
  • some packages are not even in the required archive layout (hexdump, looking at you)
  • some packages ship with outright wrong RECORD files, e.g. some SHA256 checksums are just wrong
  • pre-release handling is somewhat underspecified

We already dealt with a bunch of these corner cases, but ultimately it doesn't seem very worthwhile for the Python ecosystem to have two competing Rust implementations that deal with Python packages and all their corner cases.

In conclusion, we are very excited to integrate uv into pixi. It will quickly open the door to many highly anticipated features, such as editable installs and git & path dependencies, and will make our PyPI integration more feature-complete. We've already started working on the integration and expect it to land in a few weeks.

This also means that we can refocus some of our energy to get the things done that need to get done in the conda ecosystem. We are actively working on some of the most important "conda enhancement proposals" (think PEP, but for Conda) to change the way conda package recipes are written. We also have a few more important enhancements coming up for standardization: optional dependencies, smaller repodata files, and many ideas to push the conda ecosystem forward.

Our goals for pixi

Our goals for pixi remain unchanged – we want to solve dependency hell. Pixi can manage both your project-scoped and your global software installations, and we have excellent docs. Installing pixi is a simple one-liner (or install it from conda-forge, brew, winget, arch, ...):

curl -fsSL | bash
# or on Windows PowerShell
iwr -useb | iex

On a global level, you can install tools that you would otherwise install with apt-get, Homebrew or WinGet, just much faster and with the same tool on all platforms.

When you add a pixi.toml file to your project you get:

  • automatic lockfiles (like poetry, cargo or yarn)
  • a powerful task system to have simple entry points to compile, install, and run your packages and workflows
  • extremely fast installations of conda packages and the entire open source conda-forge ecosystem at your fingertips. conda-forge contains lots of your favorite tools

Working on a project should be as simple as:

What are conda packages

A lot of people associate conda packages with (only) "Python", but that is pretty far from the truth. Conda packages can contain all kinds of software (such as R, C++, Java, etc.) and are not limited to Python. Usually, conda packages contain shared libraries, and as such conda is much closer in spirit to a Linux distribution like Ubuntu/Fedora (e.g. apt-get or yum) than to pip.

That is why conda-forge does not only ship Python packages, but also the Python interpreter and all its dependencies (such as bzip2, openssl etc.) – as well as R, Julia, and many other languages.

What's next

We have a couple of blog posts coming up: how to move from conda to pixi and how to build your own conda packages with rattler-build. Make sure to follow us on Twitter or join our Discord server.