Cross compiling in the Conda ecosystem

Wolf Vollprecht

At prefix.dev we are heavily invested in the Conda ecosystem and try to make it better with our fast build tool, rattler-build! With this blog post we want to demystify how you can build an application on one architecture for another, and what host and build requirements actually mean.

Cross-compilation is a fundamental capability in modern software development, allowing developers to build packages for different architectures without needing access to the target hardware. In the Conda ecosystem, this becomes particularly powerful when combined with rattler-build, our fast alternative to conda-build. Let's dive into how cross-compilation works in rattler-build and explore the concepts of build and host environments.

Understanding the Cross-Compilation Landscape

Cross-compilation involves building software on one platform (the build machine) that will run on a different platform (the host machine). This is increasingly important as we see more diverse architectures in production environments—from ARM-based servers to Apple Silicon laptops.

If everything is configured correctly, cross-compilation with rattler-build is as easy as saying:

rattler-build build --target-platform={linux-aarch64, osx-64, ...}

In the context of Conda packages, cross-compilation allows us to:

  • Build packages for ARM architectures on x86_64 CI runners

  • Create packages for different operating systems

  • Support embedded systems and specialized hardware

  • Reduce build times by running the compiler at native speed on powerful build servers. Building under emulation also works, but it is usually much slower

Build vs Host Environments in rattler-build

The distinction between build and host environments is crucial for successful cross-compilation. This terminology has its roots in GNU Autotools, which established the convention of three distinct platforms in cross-compilation scenarios:

  • Build: The platform where the compilation is happening

  • Host: The platform where the compiled code will run

  • Target: The platform that the compiled code will generate output for (only relevant for compilers and similar tools)

In the Autotools world, these are specified with flags like --build, --host, and --target. For example, when cross-compiling GCC itself, you might use:

./configure --build=x86_64-pc-linux-gnu \
            --host=aarch64-linux-gnu \
            --target=aarch64-linux-gnu
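In Conda recipes the same flags are usually filled in from triplets that match the build and host platforms. The following is a minimal sketch of that mapping, assuming the triplet naming used by conda-forge's Linux toolchains. The helper names conda_triplet and configure_flags are hypothetical, and only a few platforms are listed.

```shell
# Hypothetical helper: map a conda platform name to a GNU-style triplet.
# The table is an assumption modelled on conda-forge's Linux toolchains.
conda_triplet() {
  case "$1" in
    linux-64)      echo "x86_64-conda-linux-gnu" ;;
    linux-aarch64) echo "aarch64-conda-linux-gnu" ;;
    linux-ppc64le) echo "powerpc64le-conda-linux-gnu" ;;
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

# Assemble the Autotools flags for a given build and host platform.
configure_flags() {
  local build host
  build="$(conda_triplet "$1")" || return 1
  host="$(conda_triplet "$2")" || return 1
  echo "--build=${build} --host=${host}"
}

# Building on linux-64 for linux-aarch64:
configure_flags linux-64 linux-aarch64
# prints: --build=x86_64-conda-linux-gnu --host=aarch64-conda-linux-gnu
```

The --target flag is left out on purpose, since it only matters for packages that generate code themselves.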

The Conda ecosystem adopted this terminology but simplified it for most use cases, focusing primarily on build and host environments since most packages aren't compilers themselves. Let's break down what each environment represents:

Build Environment

The build environment contains tools and dependencies that need to run during the package building process. This includes:

  • Compilers (GCC, Clang, Rust compiler)

  • Build tools (CMake, Make, Cargo)

  • Code generators and preprocessors

  • Any tool that executes during the build

Host Environment

The host environment contains libraries and headers that the package being built will link against. These dependencies:

  • Must be compatible with the target platform

  • Include shared libraries, static libraries, and development headers

  • Are added to the final package's run requirements automatically through run exports (if they're shared libraries)

Here's a practical example of a rattler-build recipe that demonstrates this separation:

requirements:
  build:
    - ${{ compiler('c') }}
    - ${{ compiler('cxx') }}
    - cmake
    - ninja
    - pkg-config
  host:
    - zlib
    - openssl
    - boost-cpp
  run:
    # - libzlib (automatically added as run export from `zlib`)
    # - openssl (automatically added as run export from openssl)

The Compiler Function Magic

One of rattler-build's elegant features is the ${{ compiler() }} function. When cross-compiling, this function automatically selects the appropriate cross-compiler for your target platform.

For example, when building from linux-64 to linux-aarch64:

rattler-build build --recipe recipe.yaml --target-platform linux-aarch64

The ${{ compiler('c') }} function will resolve to a cross-compiler package like gcc_linux-aarch64 that:

  • Runs on linux-64 (your build platform)

  • Produces binaries for linux-aarch64 (your target platform)

In other words, the compiler package is installed into the build environment, while the binaries it emits are meant for the host platform.
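To make the naming scheme concrete, here is a rough sketch of how such a package name could be assembled as "<compiler>_<target_platform>". The function compiler_package is hypothetical and is not rattler-build's implementation. The default compiler names are assumptions modelled on conda-forge, and real recipes can override them through variant configuration.

```shell
# Hypothetical approximation of the compiler package naming scheme.
# Default compiler names per operating system are assumptions.
compiler_package() {
  local lang target name
  lang="$1"
  target="$2"
  # "${target%%-*}" keeps the operating system part, e.g. "linux".
  case "${target%%-*}:${lang}" in
    linux:c)   name="gcc" ;;
    linux:cxx) name="gxx" ;;
    osx:c)     name="clang" ;;
    osx:cxx)   name="clangxx" ;;
    *) echo "no default for ${lang} on ${target}" >&2; return 1 ;;
  esac
  echo "${name}_${target}"
}

compiler_package c linux-aarch64   # prints: gcc_linux-aarch64
compiler_package cxx osx-arm64     # prints: clangxx_osx-arm64
```

Note that the build platform does not appear in the name: the package is simply resolved for the build platform when the build environment is created.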

Compiler Activation Scripts

The magic behind Conda's cross-compilation support lies in compiler activation scripts. These scripts are part of the compiler packages and automatically set up the build environment with the correct flags and paths.

When a compiler package is installed, it includes activation scripts that:

  • Set environment variables like CC, CXX, CFLAGS, LDFLAGS

  • Configure the sysroot for cross-compilation

  • Set up paths to the correct headers and libraries

  • Define target-specific compiler flags

For example, a typical activation script might set:

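# The compiler and sysroot live in the build environment ($BUILD_PREFIX);
# $PREFIX is the host environment that holds the libraries to link against.
export CC=$BUILD_PREFIX/bin/x86_64-conda-linux-gnu-gcc
export CXX=$BUILD_PREFIX/bin/x86_64-conda-linux-gnu-g++
export CFLAGS="${CFLAGS} --sysroot=$BUILD_PREFIX/$HOST/sysroot"
export CMAKE_PREFIX_PATH=$PREFIX:$BUILD_PREFIX/$HOST/sysroot/usr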

These activation scripts ensure that both the build and host environments are properly configured during the build process.

You can find the sources for the activation scripts in the compiler activation feedstocks on conda-forge; the macOS / Clang scripts live in a separate feedstock.
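When a cross build misbehaves, a quick check of the activated variables often helps. The sketch below is a hypothetical helper (check_cross_compiler is not part of any compiler package) that confirms a compiler name starts with the expected host triplet. It assumes CC has the form "<triplet>-<tool>", as in the example above.

```shell
# Hypothetical sanity check for a build script: does the activated
# compiler target the expected host triplet?
check_cross_compiler() {
  local cc_name host
  cc_name="${1##*/}"   # strip any directory component from CC
  host="$2"
  case "$cc_name" in
    "${host}-"*)
      echo "ok: ${cc_name} targets ${host}" ;;
    *)
      echo "mismatch: ${cc_name} does not target ${host}" >&2
      return 1 ;;
  esac
}

# Example values; in a real build they would come from the activation
# scripts, e.g. check_cross_compiler "$CC" "$HOST".
check_cross_compiler "/opt/build_env/bin/aarch64-conda-linux-gnu-gcc" \
                     "aarch64-conda-linux-gnu"
# prints: ok: aarch64-conda-linux-gnu-gcc targets aarch64-conda-linux-gnu
```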

Cross-Platform Testing

One of the most innovative aspects of rattler-build is its approach to cross-platform testing. The tool provides separate build and run environments within the test section, enabling you to test cross-compiled packages even when you can't natively execute them.

Test Environment Structure

tests:
  - script:
      - test -f $PREFIX/lib/libmylib.so  # Only inspects files, so it works on the build machine
      - python -c "import mymodule"       # Executes target-platform code, so it needs emulation or native hardware
    requirements:
      build:
        - python
        - pytest
      run:
        - python
        - numpy

This separation allows for sophisticated testing strategies:

  1. Build-time tests: Verify that files are installed correctly, check pkg-config files, validate metadata

  2. Runtime tests: When possible (using emulation or actual hardware), run the compiled binaries

  3. Cross-platform validation: Use QEMU or other emulators to test ARM binaries on x86_64 systems

Leveraging QEMU for Cross-Platform Testing

rattler-build can utilize QEMU for testing cross-compiled packages. The CROSSCOMPILING_EMULATOR environment variable points to the appropriate QEMU binary:

tests:
  - script:
      - |
        if [[ "$CONDA_BUILD_CROSS_COMPILATION" == "1" ]]; then
          $CROSSCOMPILING_EMULATOR python -c "import mymodule; mymodule.test()"
        fi
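The same idea can be wrapped in a small helper so that a test script behaves sensibly in all three situations: native build, cross build with an emulator, and cross build without one. This is a minimal sketch; run_on_target is a hypothetical name, and it assumes the two environment variables are set as described above.

```shell
# Hypothetical wrapper: run a command natively, through the emulator,
# or skip it when cross-compiling without an emulator.
run_on_target() {
  if [ "${CONDA_BUILD_CROSS_COMPILATION:-0}" != "1" ]; then
    "$@"                               # native build: run directly
  elif [ -n "${CROSSCOMPILING_EMULATOR:-}" ]; then
    # Left unquoted so an emulator command with arguments is split.
    ${CROSSCOMPILING_EMULATOR} "$@"
  else
    echo "skipped (no emulator): $*"
  fi
}

run_on_target echo "hello from the test"
```

Skipping rather than failing keeps file-based checks useful on CI runners that have no emulator installed.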

Conditional Dependencies

Sometimes you need different dependencies when cross-compiling:

requirements:
  build:
    - ${{ compiler('c') }}
    - ${{ compiler('cxx') }}
    - if: build_platform != target_platform
      then:
        - qemu-user-static  # For testing
        - cross-python_${{ target_platform }}
  host:
    - python
    - numpy

Best Practices for Cross-Compilation

  1. Always distinguish build and host dependencies: Tools that run during build go in build, libraries to link against go in host

  2. Use compiler functions: Let rattler-build handle compiler selection through ${{ compiler() }}

  3. Test thoughtfully: Design tests that can validate your package without necessarily running target binaries

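  4. Handle environment variables: Some build systems need hints about cross-compilation. CMake reads these settings as -D cache variables rather than from the environment, so append them to CMAKE_ARGS and pass ${CMAKE_ARGS} to the cmake call:

    if [[ "${CONDA_BUILD_CROSS_COMPILATION}" == "1" ]]; then
      # Setting CMAKE_SYSTEM_NAME typically makes CMake enable CMAKE_CROSSCOMPILING itself
      CMAKE_ARGS="${CMAKE_ARGS} -DCMAKE_SYSTEM_NAME=Linux"
      CMAKE_ARGS="${CMAKE_ARGS} -DCMAKE_SYSTEM_PROCESSOR=${target_platform##*-}"
      export CMAKE_ARGS
    fi

  5. Document platform-specific behavior: Make it clear in your recipe when certain features are only available on specific platforms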

  6. Find examples on GitHub: GitHub's code search is a good way to find working examples of cross-compilation. Just search for "CONDA_BUILD_CROSS_COMPILATION" in the conda-forge org.
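One caveat for the CMake hints in item 4: the expansion ${target_platform##*-} yields "64" rather than "x86_64" for platforms such as linux-64 and osx-64, and the system name is hard-coded. Below is a minimal sketch of a more defensive mapping, written as -D arguments for a cmake call. The helper cmake_cross_args is hypothetical and covers Linux and macOS only.

```shell
# Hypothetical helper: derive CMake cross-compilation arguments from a
# conda platform name such as "linux-aarch64".
cmake_cross_args() {
  local os arch system
  os="${1%%-*}"      # part before the first dash, e.g. "linux"
  arch="${1##*-}"    # part after the last dash, e.g. "aarch64"
  case "$os" in
    linux) system="Linux" ;;
    osx)   system="Darwin" ;;
    *) echo "unsupported platform: $1" >&2; return 1 ;;
  esac
  # "linux-64" and "osx-64" name x86_64 by bit width only.
  if [ "$arch" = "64" ]; then
    arch="x86_64"
  fi
  echo "-DCMAKE_SYSTEM_NAME=${system} -DCMAKE_SYSTEM_PROCESSOR=${arch}"
}

cmake_cross_args linux-aarch64
# prints: -DCMAKE_SYSTEM_NAME=Linux -DCMAKE_SYSTEM_PROCESSOR=aarch64
```

In a build script the result could be appended to the cmake command line, for example as $(cmake_cross_args "$target_platform").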

Reproducible Cross-Platform Builds

Cross-compilation is a first-class citizen in the Conda ecosystem. The clear separation of build and host environments, combined with sophisticated testing capabilities and automatic compiler activation scripts, enables developers to:

  • Build packages for multiple architectures from a single CI/CD pipeline

  • Reduce infrastructure costs by centralizing builds

  • Ensure consistency across platforms

  • Test packages more thoroughly before release

We want to make cross-compilation with rattler-build extremely simple (just set --target-platform=...). Do let us know if you have feedback.

Another cool use case for cross-compilation is emscripten-forge, where everything is cross-compiled to WebAssembly.

Chat with us on Discord if you have any questions: https://discord.gg/kKV8ZxyzY4