Introducing Pixi's Multiple Environments

Pixi is our cutting-edge package manager for proper cross-platform, binary package management on Linux, macOS and Windows. We're proud to tell you that we have launched the multiple environment feature in pixi. This feature was designed together with multiple contributors and kickstarted by our community.

Developing software usually means running a couple of different tasks. You can express these tasks beautifully with pixi: compiling the software, building the docs or running the tests – the pixi task system has got you covered.

However, these tasks might have different requirements, and some users might simply not be interested in building the documentation locally. For this reason, we are excited to ship the new multi-env feature in pixi where tasks can have their own environments. And you can build specialized environments too, for example to run with or without CUDA!

You've built what now?

Environments are pixi's bread and butter. Before release v0.13.0 we only supported one environment per project. An environment is a complete installation of all dependencies into a folder (.pixi) in your project folder. Pixi links all the required files into the correct location and – badabing badaboom – you have a pixi environment (fun fact: the links are either hard links or reflinks, so an environment doesn't take up any extra space on your disk).

These environments are platform specific: the one for linux-64 differs from the one for osx-arm64. In theory, you could already call these different environments, since they might consist of different packages. That means pixi was already solving multiple environments every time you created a project that supported multiple platforms. But these were always designed to run on a single platform and to look as similar to each other as possible.

With the latest release we allow you to create more than one environment for your project. This works by introducing "features": you can create sets of packages for a feature such as "test", "dev" or any other name you come up with.

An environment can then consist of one or more of these features. This gives the pixi user a real magic wand to tailor their environments to their specific needs.

A feature can contain the full description of an environment, including dependencies, platforms, channels and even system requirements. More details can be found in the documentation.

Combining these features allows the user to mix and match environments.

Let's start with a simple example, adding a test environment.

[project]
name = "project"
channels = ["conda-forge"]
platforms = ["linux-64","osx-arm64", "osx-64", "win-64"]

[dependencies] # read by pixi as [feature.default.dependencies]
python = "*"
numpy = ">1.23"
polars = ">=0.20"

# further constrain python to 3.10 in this feature
[feature.py310.dependencies]
python = "3.10.*"

[feature.test.dependencies]
pytest = ">=6.2"
pytest-cov = ">=3.0"
pytest-xdist = ">=2.4"
ruff = ">=0.2"

[environments]
test = ["test"] # This is read as ["default", "test"]
test-py310 = ["test", "py310"] # test against python 3.10

This test environment is only needed by developers and CI. If you are simply running the project, you don't need to install the test environment at all. You can keep using the default environment with no changes!

pixi run python script.py # runs the default environment
pixi run --environment test pytest -s # runs the test environment

This example shows the setup of an environment in its simplest form. You can create environments for different stages of your development process this way, and an environment can combine any number of features.
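To make the composition rule concrete, here is a minimal Python sketch of how the environments in the manifest above expand into features. The dictionaries and the helper names (`expand_features`, `collect_requirements`) are hypothetical and only mirror the example; this is not how pixi itself is implemented.

```python
# Hypothetical, simplified model of the manifest above (not pixi's real data structures).
features = {
    "default": {"python": "*", "numpy": ">1.23", "polars": ">=0.20"},
    "py310": {"python": "3.10.*"},
    "test": {"pytest": ">=6.2", "pytest-cov": ">=3.0", "pytest-xdist": ">=2.4", "ruff": ">=0.2"},
}
environments = {"test": ["test"], "test-py310": ["test", "py310"]}


def expand_features(env_name):
    """Return the feature list of an environment; "default" is always included first."""
    if env_name == "default":
        return ["default"]
    return ["default"] + [f for f in environments[env_name] if f != "default"]


def collect_requirements(env_name):
    """Gather every constraint per package; a solver has to satisfy all of them at once."""
    requirements = {}
    for feature in expand_features(env_name):
        for package, spec in features[feature].items():
            requirements.setdefault(package, []).append(spec)
    return requirements


print(expand_features("test-py310"))                 # ['default', 'test', 'py310']
print(collect_requirements("test-py310")["python"])  # ['*', '3.10.*']
```

Note that the py310 feature does not replace the default python requirement. Both constraints are collected, and the solver has to find a version that satisfies them together.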

Real dependencies stay together

I hear you thinking: now the test environment might end up with different versions of python and numpy than the default environment. And you would be right, but pixi has a solution for that as well. You can define a solve-group, which is a group of environments whose dependencies are solved together. By putting the default environment and the test environment in the same solve-group, the exact same dependency versions end up in both environments.

[environments]
test = { features = ["test"], solve-group = "group1" }
default = { solve-group = "group1" }

Now both environments share the same dependency versions; the test environment simply installs more of the total set than the default environment. This is really powerful when a tester needs extra dependencies but wants to test the exact versions that are going to be deployed.
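One way to picture a solve-group is as a single solve over the union of all requirements in the group, after which every environment keeps only the packages it asked for. The Python sketch below illustrates that idea under this assumption. `placeholder_solve` is a stand-in for a real solver, its pins are placeholders rather than real versions, and none of these names come from pixi.

```python
# Hypothetical model of the manifest above: package names per environment
# and the solve-group that ties the environments together.
env_packages = {
    "default": {"python", "numpy", "polars"},
    "test": {"python", "numpy", "polars", "pytest", "pytest-cov", "pytest-xdist", "ruff"},
}
solve_groups = {"group1": ["default", "test"]}


def placeholder_solve(packages):
    """Stand-in for a real solver: returns one placeholder pin per requested package."""
    return {name: f"{name}==<pinned>" for name in sorted(packages)}


def solve_group(group):
    """Solve the union of the group once, then hand each environment its own subset."""
    members = solve_groups[group]
    union = set().union(*(env_packages[env] for env in members))
    pinned = placeholder_solve(union)  # a single solve for the whole group
    return {env: {name: pinned[name] for name in sorted(env_packages[env])} for env in members}


locked = solve_group("group1")
shared = env_packages["default"] & env_packages["test"]
# Every package that both environments need gets the identical pin.
print(all(locked["default"][name] == locked["test"][name] for name in shared))  # True
# Packages that only the test environment installs.
print(sorted(set(locked["test"]) - set(locked["default"])))
```

Because there is only one solve per group, two environments in the same group cannot drift apart on a shared package.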

Okay, that was easy, right? It kinda feels like the optional dependencies or features that you have seen before in other tools. But we allow you to define as many of these features and environments as you want. Testing 20 different versions of python? Simply make the environments and let pixi do the heavy lifting. Note that the key here is that every single environment will be locked in the pixi.lock file.

This means that you can always go back to the same environment that you had before, on any of the platforms you have defined. Pixi might be the first tool that allows both workflows, solving environments together or separately, and then locking them down in a lockfile.

To CUDA or not to CUDA: Mlx in the mix

Let's take a look at a more complex example, one of the more famous issues in the data science world.

Features are not limited to dependencies: you can define complete environments, including system requirements. Let's say you have a machine learning project that needs a CUDA-specific or a CPU-specific environment, depending on the machine you are running on. You can define a pixi.toml that has two different environments (default and cuda). The CPU environment is the default environment, and the cuda feature is added on top of it when needed.

[project]
name = "ml_project"
channels = ["conda-forge", "pytorch"]
platforms = ["linux-64","osx-arm64", "osx-64", "win-64"]

[dependencies] # read by pixi as [feature.default.dependencies]
python = "3.11.*"
numpy = ">1.23"
pytorch = {version = ">=2.0.1", channel = "pytorch"}
torchvision = {version = ">=0.15", channel = "pytorch"}

[tasks]
train = "python train.py"

# Define everything needed to create the perfect cuda environment
[feature.cuda]
# The set of platforms will be combined in the environments and
# the intersection will be taken to see which platforms are supported.
platforms = ["linux-64", "win-64"]
# pixi will check if the correct cuda version is installed
system-requirements = {cuda = "12.1"}
channels = ["nvidia", {channel = "pytorch", priority = -1}]
dependencies = {cuda = ">=12.1", pytorch-cuda = {version = "12.1.*", channel = "pytorch"}}
tasks = {train = "python train.py --cuda"}

[environments]
# all environments include the default feature
cuda = ["cuda"] # This is read as ["default", "cuda"]

This example can then be used to run code in different environments. If the cuda environment and the default environment are both available on the platform you are running on, then pixi will spawn a selector to let you choose the environment to run the task in.

➜ pixi run train # Spawns the selector for which environment to run in.
? The task 'train' can be run in multiple environments.
Please select an environment to run the task in: ›

❯ default
  cuda

pixi run --environment cuda train # Runs the train task directly in the cuda environment
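The platform rule from the comments in the manifest (an environment supports the intersection of the platforms of its features) also explains when the selector appears. Here is a small Python sketch of that rule. The dictionaries mirror the example, and the helpers `supported_platforms` and `candidate_environments` are hypothetical names, not pixi APIs.

```python
# Hypothetical model of the ml_project manifest above.
project_platforms = {"linux-64", "osx-arm64", "osx-64", "win-64"}
feature_platforms = {
    "default": project_platforms,      # the default feature uses the project platforms
    "cuda": {"linux-64", "win-64"},    # the cuda feature narrows them down
}
environments = {"default": ["default"], "cuda": ["default", "cuda"]}


def supported_platforms(env_name):
    """Intersect the platforms of every feature in the environment."""
    platforms = set(project_platforms)
    for feature in environments[env_name]:
        platforms &= feature_platforms[feature]
    return platforms


def candidate_environments(platform):
    """Environments that could run the 'train' task on the given platform."""
    return [env for env in environments if platform in supported_platforms(env)]


print(sorted(supported_platforms("cuda")))  # ['linux-64', 'win-64']
print(candidate_environments("linux-64"))   # ['default', 'cuda']
print(candidate_environments("osx-arm64"))  # ['default']
```

With more than one candidate, as on linux-64, there is a choice to make, which is where the selector comes in. On osx-arm64 only the default environment qualifies.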

Now let's get more nerdy

As if it wasn't already nerdy enough, we have a few more tricks up our sleeve. We've made all the environment solving completely parallel (thanks to resolvo, our SAT solver).

Parallelism

Speed definitely sets pixi apart from other tools around. It looks pretty cool if you have 10 environments that support 4 platforms each:
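As a rough illustration of that fan-out, the Python sketch below treats every (environment, platform) pair as an independent job and hands the jobs to a thread pool. It only shows the structure of the work. The `solve` function is a placeholder that does no real solving, the environment names are made up, and pixi's actual implementation is not written in Python.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Made-up environment names; the platforms are the ones used throughout this post.
environments = [f"env-{i}" for i in range(10)]
platforms = ["linux-64", "osx-arm64", "osx-64", "win-64"]


def solve(job):
    """Placeholder for one solve: a real solver would return pinned packages here."""
    env, platform = job
    return (env, platform, "solved")


# Every (environment, platform) pair is an independent job.
jobs = list(product(environments, platforms))

with ThreadPoolExecutor() as pool:
    # pool.map keeps the results in the same order as the jobs.
    results = list(pool.map(solve, jobs))

print(len(results))  # 40
```

Ten environments on four platforms add up to 40 independent solves, which is why running them concurrently pays off.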

The new pixi.lock format

Each dependency corresponds to an entry in the lockfile that specifies exactly the artifact and metadata of the package. With this information we can statically verify (without an internet connection) that the lockfile satisfies all requirements described in the pixi.toml file and whether we need to re-solve. When the requirements of a project change, pixi intelligently updates just the parts of the lockfile that are out-of-date.

It also means that if you have 100 dependencies per environment, 4 platforms and 10 environments, you end up with 100 × 4 × 10 = 4000 entries in the lockfile. Every entry requires a few lines of metadata, so it grows quickly. We have modified the lockfile to deduplicate entries, making the file as small as possible while keeping it human-readable.

This resulted in a file that looks like this pseudo example:

version: 4
environments:
  default:
    channels:
    - url: https://conda.anaconda.org/conda-forge/
    packages:
      linux-64:
      - conda: https://conda.anaconda.org/conda-forge/linux-64/curl-8.5.0-hca28451_0.conda
      ...
packages:
- kind: conda
  name: curl
  version: 8.5.0
  build: hca28451_0
  subdir: linux-64
  url: https://conda.anaconda.org/conda-forge/linux-64/curl-8.5.0-hca28451_0.conda
  sha256: febf098d6ca901b589d02c58eedcf5cb77d8fa4bfe35a52109f5909980b426db
  md5: e5e83fb15e752dbc8f54c4ac7da7d0f1
  depends:
  - krb5 >=1.21.2,<1.22.0a0
  - libcurl 8.5.0 hca28451_0
  - libgcc-ng >=12
  - libssh2 >=1.11.0,<2.0a0
  - libzlib >=1.2.13,<1.3.0a0
  - openssl >=3.2.0,<4.0a0
  - zstd >=1.5.5,<1.6.0a0
  license: curl
  license_family: MIT
  size: 94895
  timestamp: 1701860161671
...

This will still be several thousand lines long for real examples, but it is much smaller than the previous lockfile style would have been with multiple environments. With this change we have, unfortunately, completely let go of the conda-lock style lockfile. We are planning to eventually work on a CEP (Conda Enhancement Proposal) to standardize this new lockfile format.
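The deduplication idea can be sketched in a few lines of Python: the full metadata of a package is stored once, keyed by its URL, and every environment and platform only keeps a reference to it. The function `build_lock` and the shape of its input are assumptions made for this illustration; the real pixi.lock writer is not shown here.

```python
def build_lock(resolved):
    """Build a deduplicated lock structure.

    `resolved` maps environment -> platform -> list of package records (dicts with a "url").
    """
    packages = {}       # url -> full record, stored exactly once
    environments = {}   # environment -> platform -> list of url references
    for env, per_platform in resolved.items():
        for platform, records in per_platform.items():
            refs = environments.setdefault(env, {}).setdefault(platform, [])
            for record in records:
                packages.setdefault(record["url"], record)
                refs.append(record["url"])
    return {"environments": environments, "packages": list(packages.values())}


# The curl record from the example lockfile, trimmed to a few fields.
curl = {
    "name": "curl",
    "version": "8.5.0",
    "url": "https://conda.anaconda.org/conda-forge/linux-64/curl-8.5.0-hca28451_0.conda",
}

# Ten made-up environments that all resolve to the same curl build on linux-64.
resolved = {f"env-{i}": {"linux-64": [dict(curl)]} for i in range(10)}
lock = build_lock(resolved)

references = sum(
    len(refs) for platforms in lock["environments"].values() for refs in platforms.values()
)
print(references)             # 10
print(len(lock["packages"]))  # 1
```

Ten environments still produce ten references, but the multi-line metadata block for curl appears only once.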

Let's connect

Do you want a more detailed explanation of the speed improvements and cool graph tricks we built?

Chat with us on Discord, Twitter or Mastodon and we might write a blog post about it!

Written on March 18, 2024 by:

Ruben Arts