The new rattler-build

Written by Wolf Vollprecht 2 years ago

As part of our effort to make packaging more joyful, we have invested considerably in a new package building experience. We want to make package building fast, debuggable and reproducible. rattler-build is a new tool to build cross-platform, relocatable packages.

Coming from years of experience as a conda-forge core maintainer, I think rattler-build is going to dramatically lower the burden for people packaging software. conda-build has accumulated a lot of cruft over the years and has gotten fairly slow. The recipe format mixes YAML, Jinja2 and YAML comments, which makes it hard to parse programmatically. The conda-build source code is dynamically typed Python, with some functions taking 10 parameters or more, which makes the logic pretty difficult to follow.

Rust to the rescue

With rattler-build we have created a rewrite of the core concepts of conda-build in Rust. rattler-build makes heavy use of the rattler library, which is a clean implementation of the conda package format. Being written in Rust, it is fully typed, straightforward to understand (lots of comments, too!), and really fast.

Rattler resolves, reads and installs packages. Lately we’ve focused our efforts on also writing packages with rattler. Rust and the ecosystem of crates greatly help (such as serde for serialization and deserialization of JSON files, or minijinja for the Jinja expressions). Static typing throughout the entire codebase makes it very easy to navigate the source code.

rattler-build is a command line tool that reads a recipe file, runs the build script and produces a conda package that can be installed with mamba, conda or rattler, pretty much just like conda-build would.

An improved developer experience (DX)

We have put a lot of love into making this tool very developer friendly. In the tables that rattler-build prints, it shows exactly which package versions were resolved, and highlights dependencies that come from variants or run_exports.

A nice side effect of the Rust development is that the build tool comes as a single binary, which will make it insanely easy to set up in CI systems.

Getting started

rattler-build is available on the conda-forge channel and thus can be installed with conda, mamba or micromamba.

We have some example recipes in the repository that can be tried out immediately.

The basic command line to execute rattler-build looks like:

# installation with micromamba
micromamba install -c conda-forge rattler-build
# And then you can get going!
rattler-build build --recipe-file ./examples/rich/recipe.yaml

We have written extensive documentation on rattler-build as a "git book" that can be found under: https://prefix-dev.github.io/rattler-build.

A reproducible future?

The "holy grail" of packaging is reproducible packages, so that a package is bit-for-bit identical whether it's built in CI or locally on the developer's machine. I have always been curious how hard it would be to make conda packages reproducible. Thanks to rattler-build, we finally have a good testbed to experiment more in this direction. From the start, we’ve designed the package writing features in rattler and rattler-build with reproducibility in mind.

This means a lot of details need to be done right: the sorting of the files in the archive, timestamps, and more all need to match up. The reproducible-builds.org website lists many more details.
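
To make the sorting and timestamp points concrete, here is a minimal Python sketch using only the standard library. It is not how rattler or rattler-build write packages; the function names (build_deterministic_tar, digest) are made up for this illustration, and the archive is left uncompressed to keep the example small.

```python
import hashlib
import io
import tarfile
from typing import Mapping


def build_deterministic_tar(files: Mapping[str, bytes], mtime: int = 0) -> bytes:
    """Pack `files` (archive path -> content) into an uncompressed tar archive.

    Hypothetical helper for illustration only.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.GNU_FORMAT) as archive:
        # Sort the entries so the order inside the archive does not depend on
        # insertion order or on how the filesystem happens to list files.
        for name in sorted(files):
            content = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            info.mtime = mtime  # fixed timestamp instead of the current time
            info.mode = 0o644   # fixed permissions
            info.uid = info.gid = 0
            info.uname = info.gname = ""  # no user or group names from the build machine
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def digest(files: Mapping[str, bytes], mtime: int = 0) -> str:
    """SHA-256 of the archive, handy for comparing two builds."""
    return hashlib.sha256(build_deterministic_tar(files, mtime)).hexdigest()
```

With this sketch, two builds that add the same files in a different order produce the same digest, while changing the timestamp changes it.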

We’ve been able to produce some simple packages in a reproducible way, and are very much looking forward to expanding on that.

Technical deep dive

What is in a conda package?

A conda package is mostly a “tarball” of files that are expanded into the installation prefix. The installation prefix is basically a UNIX prefix with folders like /lib, /share, /bin, etc.

There is one exception to this rule, namely noarch: python packages. The root of these packages contains the specially treated python-scripts and site-packages folders, which are copied into the appropriate locations at installation time: python-scripts goes to /bin (or /Scripts on Windows), and site-packages goes to lib/pythonX.X/site-packages on Unix and the proper location on Windows.
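
The mapping described above can be sketched as a small path-rewriting function. This is only an illustration, not the installer code from rattler; the function name noarch_python_target is hypothetical, and the Windows location Lib/site-packages is an assumption, since the text above only says "the proper location".

```python
from pathlib import PurePosixPath


def noarch_python_target(package_path: str, python_version: str, windows: bool = False) -> str:
    """Map a path inside a noarch: python package to its location in the prefix.

    Hypothetical helper for illustration only.
    """
    parts = PurePosixPath(package_path).parts
    if parts and parts[0] == "python-scripts":
        root = "Scripts" if windows else "bin"
        return str(PurePosixPath(root, *parts[1:]))
    if parts and parts[0] == "site-packages":
        # Assumption: Lib/site-packages on Windows; the Unix path depends on
        # the Python version installed in the target environment.
        root = "Lib/site-packages" if windows else f"lib/python{python_version}/site-packages"
        return str(PurePosixPath(root, *parts[1:]))
    # Everything else is expanded into the prefix unchanged.
    return package_path
```

For example, site-packages/rich/__init__.py would land in lib/python3.11/site-packages/rich/__init__.py when the environment contains Python 3.11.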

For non-noarch packages, the main trick of rattler-build (and conda-build) is to make packages relocatable. That means, even though they were built in a given prefix, we can install them later into any directory (prefix) we want.

There are two tricks enabling that:

  • At build time, rattler-build uses an insanely long installation prefix. Those who have used conda packages extensively might have seen placehold_placehold_placehold… repeated many times as the installation prefix (it goes up to 255 characters). The reason is that it’s easier to replace a long string with a short one than the other way around, especially when they are replaced inside of binary files (in that case, the string is replaced and padded with \0, the C string terminator).
  • ELF and MachO files (the executable and shared library formats on Linux, usually .so, and on macOS, usually .dylib) contain information about where to find the linked libraries. This information is encoded in the “rpath” of the library or executable. Both operating systems have a special mechanism for loading linked libraries relative to the installation location. For ELF files, that is the special $ORIGIN variable, which can be added with the patchelf command. For MachO files, it is the @loader_path variable, which can be inserted using install_name_tool. Currently, rattler-build calls either tool to make the libraries easier to relocate. We handle ELF, PE (Windows) and MachO files with the awesome goblin crate.
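
Both tricks can be illustrated with a few lines of Python. This is a simplified sketch, not the code used by rattler or rattler-build: replace_prefix_binary and relative_rpath are hypothetical names, the first rewrites a placeholder prefix inside null-terminated strings and pads with \0 so the file length stays the same, and the second only computes the relative rpath string that a tool such as patchelf or install_name_tool would then write into the binary.

```python
import posixpath
import re


def replace_prefix_binary(data: bytes, old_prefix: bytes, new_prefix: bytes) -> bytes:
    """Replace old_prefix inside C strings, padding with NUL to keep offsets intact."""
    if len(new_prefix) > len(old_prefix):
        raise ValueError("the new prefix must not be longer than the placeholder prefix")
    # Match from the placeholder up to and including the C string terminator.
    pattern = re.compile(re.escape(old_prefix) + rb"[^\x00]*\x00")

    def patch(match):
        original = match.group(0)
        replaced = original.replace(old_prefix, new_prefix)
        # Pad with \0 so every later byte keeps its original offset.
        return replaced.ljust(len(original), b"\x00")

    return pattern.sub(patch, data)


def relative_rpath(binary_path: str, lib_dir: str, macos: bool = False) -> str:
    """Build an rpath entry relative to the location of the binary itself."""
    relative = posixpath.relpath(lib_dir, start=posixpath.dirname(binary_path))
    token = "@loader_path" if macos else "$ORIGIN"
    return f"{token}/{relative}"
```

The length check in replace_prefix_binary is the reason for the very long build prefix: a shorter final prefix always fits into the space the placeholder occupied.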

Converting existing recipes

The recipe format of rattler-build is a tiny bit different from conda-build. The new format came out of long community discussions to find a better spec and was already prototyped in the boa tool.

For many recipes it is straightforward to convert them to the new format. Below is an adapted recipe for the zlib library. To someone who knows conda recipes, it should immediately look familiar. Notable changes are:

  • always 100% parseable YAML
  • selectors are YAML dictionary keys (sel(...): something) and not YAML comments (# [...])
  • There are no {% set version = "1.2.3" %} statements, because they are not parseable YAML. Instead, we use a special context key that is processed before the rest of the recipe.

# variables from the context section can be used in the rest of the recipe in jinja expressions
context:
  version: 1.2.13
 
package:
  name: zlib
  version: "{{ version }}"
 
source:
  url: http://zlib.net/zlib-{{ version }}.tar.gz
  sha256: b3a24de97a8fdbc835b9833169501030b8977031bcb54b3b3ac13740f846ab30
 
build:
  # build numbers can be set arbitrarily
  number: 0
  script:
    # build script to install the package into the $PREFIX (host prefix)
    sel(unix):
      - ./configure --prefix=${PREFIX}
      - make -j${CPU_COUNT}
    sel(win):
      - cmake -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=%LIBRARY_PREFIX%
      - ninja install
 
requirements:
  build:
    # compiler is a special function. Also note the quoting around `{{` - this is necessary
    # to make sure that we always have a valid YAML file.
    - '{{ compiler("c") }}'
    # cmake and ninja are only needed on Windows and make only on Unix, so they are conditionally selected
    - sel(win): cmake
    - sel(win): ninja
    - sel(unix): make
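
To show why fully parseable YAML is convenient for tooling, here is a rough Python sketch that works on the recipe after it has been loaded into plain dictionaries and lists. It is not rattler-build's implementation (which uses minijinja for the Jinja expressions). The helper names render and select are invented, selectors are assumed to be a single name such as sel(unix), and only simple {{ name }} placeholders are substituted.

```python
import re


def render(text: str, context: dict) -> str:
    """Substitute simple {{ name }} placeholders with values from the context section."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(context[m.group(1)]), text)


def select(entries: list, active: set) -> list:
    """Keep plain entries plus those whose sel(<name>) key names an active selector."""
    selected = []
    for entry in entries:
        if isinstance(entry, dict):
            for key, value in entry.items():
                match = re.fullmatch(r"sel\((\w+)\)", key)
                if match and match.group(1) in active:
                    # A selector may guard a single item or a list of items.
                    selected.extend(value if isinstance(value, list) else [value])
        else:
            selected.append(entry)
    return selected


# The build requirements from the recipe above, as a YAML parser would return them.
build_requirements = [
    '{{ compiler("c") }}',
    {"sel(win)": "cmake"},
    {"sel(win)": "ninja"},
    {"sel(unix)": "make"},
]
unix_requirements = select(build_requirements, {"unix"})
source_url = render("http://zlib.net/zlib-{{ version }}.tar.gz", {"version": "1.2.13"})
```

Because selectors are ordinary dictionary keys, no comment parsing is needed: the same data structure can be filtered for any target platform.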

Outlook

We are aiming for broad community adoption in the conda-forge and bioconda channels. This means writing a Conda Enhancement Proposal and getting it voted on. For conda-forge, we hope that we can make a gradual move over to the new format (since it has a different file name, recipe.yaml vs meta.yaml).

There is still a lot we plan to do to implement the missing conda-build features - you can find the list of issues on GitHub. For example, we're considering making the build scripts look more like GitHub Actions. But we wanted to release it to the world to get your input and help make developers' day-to-day lives easier. So please let us know what matters most on Discord.