
Enhancing the Conda Ecosystem in 2025

Wolf Vollprecht

We started the new year with four exciting Conda Enhancement Proposals (CEPs) and want to share them with the wider community. In 2024, we already successfully proposed five important CEPs: sharded repodata, and the v1 recipe format spread across four separate CEPs.

In 2025, our commitment to modernizing the Conda ecosystem is stronger than ever. We want to keep the cadence up and improve the ecosystem much further!

Optional dependencies, conditional dependencies, and flags

Link to the CEP

This is one of the biggest CEPs and would be a huge improvement to the Conda ecosystem. We want to introduce three new features to the repodata and matchspec:

  • Optional dependency sets mirror PyPI extras (https://peps.python.org/pep-0508/). That means a package can declare optional dependency groups. Take, for example, sqlalchemy[extras=[postgres, mysql]]. In this case, sqlalchemy would automatically install all dependencies defined in the postgres and mysql groups - that is, all dependencies that are needed to use sqlalchemy together with postgres and mysql.

    Note: in the Conda ecosystem this can already be modelled by building additional packages, such as sqlalchemy-postgres or sqlalchemy-mysql, that depend on sqlalchemy. However, by using extras we can save space in the repodata and make it much easier to build and mirror packages from PyPI.

  • Conditional dependencies are dependencies that are only activated under certain conditions. For example, a Python package could depend on pywin32, but only on Windows. This is something that is already possible in the PyPI ecosystem. In Conda, some packages work around this by creating multiple variants, where one depends on __unix and the other on __win, to make sure they can only be installed on either Unix or Windows.

    We propose to extend the matchspec syntax to also allow conditions such as python >=3. The proposed syntax is:

    name: sqlalchemy
    version: 1.0.0
    depends:
      - python >=3.8  
      - pywin32; if __win
      - six; if python <3.8
  • Finally, we propose to introduce the concept of flags to make it easier to select a specific variant of a package. A package can advertise that it was compiled or configured with certain flags, such as gpu:cuda, release or debug, mpi:openmpi, or blas:mkl.

    Currently, this is done either with mutex packages or by matching globs on build strings, such as *_cuda. This is not very ergonomic though. With flags, we propose simple matching on flag values using a syntax like numpy[flags=["blas:blis", "gpu:*"]].

    We are also proposing ? and ! operators. The ! operator negates a flag and deselects any variant that has it. The ? operator selects a flag only if a package defines it, and is what we could use for "global" flag selection (it would not be an error to request ?blas:mkl on packages that don't have this flag at all). A combined sketch of the proposed syntax follows after this list.
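To illustrate how these three features could fit together, here is a minimal, hypothetical dependency list in the style of the snippet above. The syntax is still under discussion in the CEP, the package names are placeholders, and the exact placement of the ? and ! operators shown here is illustrative only:

    name: my-analysis-tool
    version: 0.1.0
    depends:
      # optional dependency set (extras): pulls in the postgres and mysql groups
      - sqlalchemy[extras=[postgres, mysql]]
      # conditional dependency: only resolved on Windows
      - pywin32; if __win
      # flag selection: require the BLIS BLAS variant and any GPU variant
      - numpy[flags=["blas:blis", "gpu:*"]]
      # ? selects a flag only where a package defines it, ! deselects variants that have it
      - scipy[flags=["?blas:mkl", "!debug"]]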

A simple sigstore predicate for conda packages

Link to the CEP

We want to implement package signing and package attestations for the Conda ecosystem, and would like to standardize this around Sigstore. Sigstore is already adopted by a number of other big open source ecosystems, such as Python/PyPI and Ruby.

With the proposed CEP we merely want to set a baseline standard of what information we want to include (and validate) as part of the attestation process.

When signing with Sigstore, the certificate already includes a lot of information about the identity of the signer (e.g. a GitHub workflow, Git commit hash, GitHub identity, and so on).

The default in-toto statement already contains one (or more) subjects that are being signed, each including the package filename and SHA256 hash. In addition, we propose adding a targetChannel field to indicate which channel the package was (initially) meant for. This is additional metadata that has to be validated on the server.
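As a rough sketch, such an attestation could look like the following (rendered as YAML for readability; real in-toto statements are JSON, and the predicateType URL, filename, and hash below are placeholders rather than the final CEP schema):

    _type: https://in-toto.io/Statement/v1
    subject:
      - name: mypackage-1.2.3-h0123456_0.conda      # filename of the signed package
        digest:
          sha256: <sha256 of the package file>       # hash of the package artifact
    predicateType: https://example.org/conda-package-attestation/v1   # placeholder type URL
    predicate:
      targetChannel: my-channel                      # proposed field: channel the package is meant for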

The cache output for v1 recipes

Link to CEP

Power users of conda-build often create "split" packages in order to ship packages that contain only what is needed. This can be facilitated with the "cache" output in v1 recipes, where a common "pre-build" step is factored out for all outputs of a recipe. This step executes first, and the other outputs can pick and choose what they need from that initial step.

That makes it easy to build a C++ package that splits the library and headers into two packages, such as foobar-lib and foobar-headers, where most downstream packages will only need the -lib output at runtime, thus reducing the installed footprint. A sketch of such a recipe is shown below.
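A minimal sketch of what such a v1 recipe could look like, assuming a top-level cache section and per-output file selectors; the exact keys are still being worked out in the CEP, and the foobar name and build commands are placeholders:

    recipe:
      name: foobar
      version: 1.0.0

    # shared "pre-build" step: compiles and installs everything into $PREFIX once
    cache:
      requirements:
        build:
          - ${{ compiler('cxx') }}
          - cmake
      build:
        script: |
          cmake -B build -DCMAKE_INSTALL_PREFIX=$PREFIX
          cmake --build build
          cmake --install build

    outputs:
      # runtime output: picks only the shared library from the cached prefix
      - package:
          name: foobar-lib
        build:
          files:
            - lib/libfoobar*
      # development output: picks only the headers
      - package:
          name: foobar-headers
        build:
          files:
            - include/foobar/**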

Additionally, the outputs can also continue from where the pre-build has left off, for example to build Python bindings for different Python versions on top of a common C/C++ core library.

There are a number of interesting challenges that we are trying to solve in the CEP, specifically around the run-exports, which need to be carried over from the pre-build step to the final outputs.

run-exports in (sharded) repodata

Link to CEP

A given conda package comes with some metadata, such as its version, license, dependencies, constraints, and so on. This data is all indexed and accessible in the repodata.json file. This file has grown in size; for conda-forge, for example, it is currently at ~200 MB for linux-64. This size has made the conda project reluctant to add additional metadata to the file, and the so-called run-exports have never been added to the repodata. Run exports are only needed at build time and not at runtime, so only conda-build and rattler-build need them.

However, with sharded repodata this calculus changes. Sharded repodata caches extremely well, and we only download what we need. The individual shards are also heavily compressed, and run exports compress extremely well (as they are usually very similar to the package name and version). Adding run exports to the shards will make building and retrieving metadata about "source" packages a lot faster, which is hugely important for our use case in pixi build.
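As an illustration, a per-package shard entry could then carry the run exports alongside the existing metadata, roughly like this (shown as YAML for readability; the actual shard encoding and exact field names are defined by the sharded repodata CEP, so treat the layout below as an assumption):

    packages.conda:
      libfoobar-1.0.0-h0123456_0.conda:
        name: libfoobar
        version: 1.0.0
        build: h0123456_0
        depends:
          - libstdcxx-ng >=12
        # proposed addition: run exports, so build tools no longer need to
        # download and unpack the package just to read them
        run_exports:
          weak:
            - libfoobar >=1.0.0,<2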

Watch the video

We are obviously excited about all these new features. If you want to chat more about the future of the Conda ecosystem you can join us on Discord or chat with us on the Conda Zulip.

We also run a live show & tell every other week on our Discord - you can watch the episode about the CEPs here:

[Video: live show & tell episode about the CEPs]