Channels

What are they and how are they used?

Channels are collections of software packages and their dependencies. A mamba channel must provide a repodata.json file, which is the index of all available packages. A channel is also divided by "architecture", or "subdir". Commonly used subdirs are:

  • noarch (this architecture works on all operating systems and exists for all channels)
  • linux-64, linux-aarch64, linux-ppc64le: the subdirs for Linux on x86-64, ARM64 and 64-bit little-endian PowerPC (note: most channels require glibc to be present on the host system; there is no musl support as of now)
  • osx-64 and osx-arm64: the subdirs for macOS intel and Apple Silicon
  • win-64: the subdir for Windows
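
As a rough illustration of how a platform maps to one of these subdirs, here is a minimal Python sketch. The function name guess_subdir and the lookup table are assumptions made for this example, not how mamba itself detects the platform, and only the subdirs listed above are covered.

```python
def guess_subdir(system: str, machine: str) -> str:
    """Map an OS name and CPU architecture to a subdir (simplified, hypothetical helper)."""
    table = {
        ("Linux", "x86_64"): "linux-64",
        ("Linux", "aarch64"): "linux-aarch64",
        ("Linux", "ppc64le"): "linux-ppc64le",
        ("Darwin", "x86_64"): "osx-64",
        ("Darwin", "arm64"): "osx-arm64",
        ("Windows", "AMD64"): "win-64",
    }
    try:
        return table[(system, machine)]
    except KeyError:
        # Platforms outside the list above are not handled by this sketch.
        raise ValueError(f"no known subdir for {system}/{machine}") from None


print(guess_subdir("Linux", "x86_64"))
print(guess_subdir("Darwin", "arm64"))
```

On a real machine the two arguments could come from platform.system() and platform.machine() in the Python standard library. Packages from the noarch subdir apply in addition to the platform-specific one.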

The URL layout of a channel is:

  • repodata: [channel-name]/[subdir]/repodata.json
  • package: [channel-name]/[subdir]/[package]-[version]-[buildstring].tar.bz2
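
The layout above can be sketched as two small Python helpers. The function names, the example.com channel address and the build string are placeholders invented for this illustration; only the path pattern itself comes from the layout described above.

```python
def repodata_url(channel: str, subdir: str) -> str:
    """Build the URL of the index file for one subdir of a channel."""
    return f"{channel.rstrip('/')}/{subdir}/repodata.json"


def package_url(channel: str, subdir: str, name: str, version: str, build: str) -> str:
    """Build the URL of a single package archive."""
    filename = f"{name}-{version}-{build}.tar.bz2"
    return f"{channel.rstrip('/')}/{subdir}/{filename}"


# Hypothetical channel and package, for illustration only.
channel = "https://example.com/my-channel"
print(repodata_url(channel, "linux-64"))
print(package_url(channel, "linux-64", "example-pkg", "1.2.3", "h0123456_0"))
```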

The most widely used channel is the conda-forge channel. It contains over 19,000 different packages and is maintained by a large group of volunteers on GitHub. There are over 4,000 individual package maintainers for conda-forge, and over 20 "core" members that steer the project. The conda-forge project publishes software packages for all of the architectures mentioned above and has many packages from the Python and R communities available.

How does the conda-forge channel operate?

The packages in the conda-forge channel are all built on public CI services (such as GitHub Actions or Azure DevOps). That has some nice benefits: it's easier to publish software for multiple operating systems, and it makes builds somewhat reproducible and traceable (as opposed to individual developers uploading binary artifacts).

It is interesting to note that, contrary to PyPI or NPM, package maintainers on conda-forge are often not the original developers of the packaged software. Conda-forge functions more like a Linux distribution where many package maintainers try to assemble a well-working package distribution.

Each package has an associated "feedstock". A "feedstock" is a GitHub repository that contains the "recipe" for the package. The recipe specifies all dependencies needed to build and run the software, and a build script with the commands that are necessary to compile and assemble the package.

If you want to publish your own open source software, conda-forge might be an ideal place. More information about the process can be found in the conda-forge documentation.

Another thing to know about conda-forge is that it is a "rolling distribution". Updates to underlying (low-level) packages are rolled out to the entire set of packages automatically with so-called "migrations". For example, the boost-cpp library is a widely used low-level C++ project. To make sure that the projects published on conda-forge that rely on boost-cpp are compatible with each other, they should all use the same version. To achieve this, there is a special global pinnings file. Whenever the pins change, every package that uses boost-cpp receives an update from a migration, which sends a PR requiring the newer boost-cpp version.
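
To make the idea of a migration more concrete, here is a toy Python sketch that picks out the feedstocks still built against an outdated pin. Everything in it is hypothetical (the function name, the feedstock names, the version numbers and the dictionary layout), and it is not how the actual conda-forge tooling is implemented.

```python
def feedstocks_to_migrate(package: str, new_pin: str, feedstocks: dict) -> list:
    """Return the feedstocks that use `package` at a pin other than `new_pin`."""
    return sorted(
        name
        for name, pins in feedstocks.items()
        if package in pins and pins[package] != new_pin
    )


# Hypothetical feedstocks and the pin each one was last built against.
feedstocks = {
    "example-a": {"boost-cpp": "1.74"},
    "example-b": {"boost-cpp": "1.78"},
    "example-c": {"zlib": "1.2"},
}

# Only example-a uses boost-cpp at an older pin, so only it would get a PR.
print(feedstocks_to_migrate("boost-cpp", "1.78", feedstocks))
```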

What is in the repodata?

The repodata contains information about each package and its dependencies. The different fields are:

  • name: the name of the package
  • version: the version of the package. Package versions are strings that usually contain numbers and dots (e.g. "3.10.0"). Note that it's somewhat tricky to compare package versions directly: one needs to split the string first and then compare the individual parts. The mamba version ordering is also described LINK TO MAMBA VERSION ORDERING.
  • build: the "build string" of a given package is usually a hash and a build number. A given package + version can have multiple builds, either because they are variants of the same package (a variant is the same package but with different compile-time options or dependencies) or because the package maintainers chose to fix something by publishing a new build of the same package version
  • dependencies: these are the dependency specifiers for the package, with a specific syntax. For example, a given package might require numpy >=3.0,<4, which would add numpy with a version of at least 3.0 and below 4 (3.0 itself is allowed, 4 is not) to the target environment
  • constrains: a package can have a constraint on another package. This works similarly to a dependency, but does not automatically install the other package into the environment. It functions like an optional dependency: if the other package is installed (on explicit request by the user or because another package pulls it in), it has to match this version or version range
  • license: this field is mandatory for the conda-forge distribution and contains the SPDX identifier of the license (e.g. BSD-3-Clause for the BSD 3-Clause license, LGPL-2.1-only for the GNU Lesser General Public License v2.1, ...)
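
The notes on version, dependencies and constrains above can be illustrated with a simplified Python sketch. It only handles purely numeric, dot-separated versions and comma-separated comparison clauses; the real mamba version ordering and specifier syntax cover more cases, and all function names here are made up for the example.

```python
import operator
import re
from itertools import zip_longest

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">": operator.gt, "<": operator.lt}


def version_key(version: str) -> tuple:
    """Split "3.10.0" into (3, 10, 0) so that parts compare as numbers."""
    return tuple(int(part) for part in version.split("."))


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1; missing trailing parts count as 0 (so "3" equals "3.0")."""
    for x, y in zip_longest(version_key(a), version_key(b), fillvalue=0):
        if x != y:
            return -1 if x < y else 1
    return 0


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a spec such as ">=3.0,<4" (all clauses must hold)."""
    for clause in spec.split(","):
        match = re.fullmatch(r"(>=|<=|==|!=|>|<)([0-9.]+)", clause.strip())
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not OPS[op](compare_versions(version, bound), 0):
            return False
    return True


def constraint_ok(installed: dict, name: str, spec: str) -> bool:
    """A constraint only applies if the named package is installed at all."""
    return name not in installed or satisfies(installed[name], spec)


# Plain string comparison gets this wrong, comparing the parts gets it right.
print("3.10.0" > "3.9.0", compare_versions("3.10.0", "3.9.0") > 0)
print(satisfies("3.0", ">=3.0,<4"), satisfies("4.0", ">=3.0,<4"))
print(constraint_ok({}, "numpy", ">=3.0,<4"), constraint_ok({"numpy": "4.1"}, "numpy", ">=3.0,<4"))
```

The last line shows the difference between a dependency and a constraint: an environment without numpy passes the constraint, while an environment with an out-of-range numpy does not.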

Other fields exist but are less important or duplicate information.
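
To show how these fields might look in practice, the following Python sketch parses a small hand-written index and prints one line per package. The record is invented for this example, and the key names are an assumption based on the layout commonly seen in repodata.json files; in particular, the dependency specifiers are assumed to live under a "depends" key.

```python
import json

# A made-up, minimal index with a single hypothetical package.
SAMPLE = """
{
  "info": {"subdir": "linux-64"},
  "packages": {
    "example-pkg-1.2.3-h0123456_0.tar.bz2": {
      "name": "example-pkg",
      "version": "1.2.3",
      "build": "h0123456_0",
      "build_number": 0,
      "depends": ["numpy >=3.0,<4"],
      "constrains": ["other-pkg >=2"],
      "license": "BSD-3-Clause"
    }
  }
}
"""


def summarize(repodata: dict) -> list:
    """Return one "name version (build): dependencies" line per package record."""
    lines = []
    for _filename, record in sorted(repodata["packages"].items()):
        deps = ", ".join(record.get("depends", [])) or "none"
        lines.append(f"{record['name']} {record['version']} ({record['build']}): {deps}")
    return lines


for line in summarize(json.loads(SAMPLE)):
    print(line)
```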