Mamba 1.2 Release

Written by Wolf Vollprecht 2 years ago

A list of the highlights of the mamba 1.2 release!

Note

Hint: To update to the latest micromamba, run micromamba self-update. And if you don't like it, you can also downgrade by running micromamba self-update --version 1.1.

If you don't have micromamba yet, run curl micro.mamba.pm/install.sh | bash on Linux and macOS or in Git Bash on Windows.

micromamba now has experimental support for zstd-encoded repodata files, which download much faster from Anaconda servers. This is especially relevant for users with fast internet connections. Previously, download speeds were limited to ~3 MB/s for the standard repodata.json files, which were compressed on the fly with gzip. The zstd-compressed files can reach transfer speeds of 20 MB/s or higher, which makes downloads much faster!

Faster micromamba repodata downloads!
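To get a feel for what the two transfer rates mean in practice, here is a minimal back-of-the-envelope sketch. The 30 MB transfer size is a hypothetical figure chosen for illustration, and the sketch assumes the same number of bytes goes over the wire in both cases, which will not be exactly true for gzip versus zstd.

```python
def download_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Time needed to transfer size_mb at a sustained rate."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_per_s


# Hypothetical transfer size; real repodata sizes vary by channel.
size_mb = 30.0

gzip_seconds = download_seconds(size_mb, 3.0)   # ~3 MB/s, on-the-fly gzip
zstd_seconds = download_seconds(size_mb, 20.0)  # 20 MB/s, precompressed zstd

print(f"gzip: {gzip_seconds:.1f} s, zstd: {zstd_seconds:.1f} s")
```

Under these assumptions the wait drops from about ten seconds to under two, and the gap grows with the size of the channel index.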

To enable this feature, add the following to your ~/.condarc or ~/.mambarc file:

repodata_use_zst: true
# optionally to preconfigure channels where you know that they have a
# repodata.json.zst file available (skips an initial check)
# conda-forge is by default configured!
# repodata_has_zst:
# - https://conda.anaconda.org/conda-forge  # default
# - ...

Note: you can also use micromamba config set repodata_use_zst true to enable this feature!

Furthermore, we've optimized how .conda packages are decompressed. Previously, we extracted the outer (uncompressed) zip file into a temporary directory and then decompressed the inner zstd-compressed files. Now we stream the contents of the zip layer directly into the zstd decompressor, without a temporary directory in between. This should reduce disk space usage and also speed up decompression.
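The streaming idea can be sketched with the Python standard library alone. This is not micromamba's implementation (which is C++ on top of libarchive); it is a minimal illustration that swaps zstd for xz, because xz ships with Python's tarfile module. The archive layout, the member name pkg-demo.tar.xz and both function names are made up for the example.

```python
import io
import tarfile
import zipfile


def build_demo_package() -> bytes:
    """Build a tiny in-memory archive shaped like a .conda file:
    an uncompressed (stored) zip that holds one compressed tarball."""
    inner = io.BytesIO()
    with tarfile.open(fileobj=inner, mode="w:xz") as tar:
        payload = b"print('hello')\n"
        info = tarfile.TarInfo("lib/demo.py")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("pkg-demo.tar.xz", inner.getvalue())
    return outer.getvalue()


def stream_extract(package: bytes) -> dict[str, bytes]:
    """Feed the zip member straight into the tar decompressor.

    The inner tarball is never written to a temporary directory:
    tarfile reads it as a forward-only stream ("r|xz") from the
    file-like object that zipfile hands out for the member."""
    files = {}
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        with zf.open("pkg-demo.tar.xz") as member:
            with tarfile.open(fileobj=member, mode="r|xz") as tar:
                for entry in tar:
                    if entry.isfile():
                        files[entry.name] = tar.extractfile(entry).read()
    return files
```

Calling stream_extract(build_demo_package()) returns the single file from the inner tarball. The stream mode matters here: it tells tarfile never to seek backwards, which is what allows the decompressor to sit directly on top of the zip layer.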

We found an issue in the libarchive library that makes the creation of "sparse" files exceedingly slow on macOS. Sparse files (files that contain long runs of "\0" bytes) are mostly useful in the context of backups. Since we do not expect many packages to contain them, we have decided to disable the sparse extraction feature of libarchive for further speed gains.

With these changes, running micromamba package extract scipy-1.10.0-py311h939689b_0.conda with 1.2 is faster than both micromamba 1.0 and conda-package-handling:

Package                                  µmamba 1.2   µmamba 1.0   cph 2.0.2
scipy-1.10.0-py311h939689b_0.conda       599.5 ms     748.8 ms     618.0 ms
jaxlib-0.4.1-cpu_py38h6beaf4d_1.conda    227.2 ms     463.7 ms     317.8 ms

Note

Hint: These benchmarks were performed with hyperfine. You can find the raw results here. The results were obtained on an M1 MacBook Pro.

To round things off, we've also added multithreaded compression for the .conda format to micromamba (you can run micromamba package compress and micromamba package transmute to create or transmute conda packages). To set the number of threads, use --compression-threads. We also adopted some other nice improvements from the new conda-package-handling library: clearing the UID, GID, UNAME and GNAME fields in the archives for better reproducibility.

In order to get closer to "reproducible" archives, we have also settled on a deterministic order in the archives – hopefully we can write an enhancement proposal about that soon!

The last step to reproducible archives is to fix up timestamps when packaging. After that it could be possible to produce some reproducible archives! To read more about this topic, follow this link: https://reproducible-builds.org/docs/archives/
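The three ingredients mentioned above (cleared owner fields, a deterministic member order, fixed timestamps) can be illustrated with Python's tarfile module. This is only a sketch of the general technique, not the code used by micromamba or conda-package-handling. The function names are made up, and zeroing the timestamp is one possible way to handle the timestamp step that is still described as open.

```python
import io
import os
import tarfile


def normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Strip metadata that differs between machines and runs."""
    info.uid = info.gid = 0        # numeric owner and group
    info.uname = info.gname = ""   # owner and group names
    info.mtime = 0                 # assumption: pin timestamps to the epoch
    return info


def build_archive(root: str) -> bytes:
    """Pack every file under root into a tar with a stable layout."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            paths.append(os.path.join(dirpath, filename))

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Sorting makes the member order independent of the file system.
        for path in sorted(paths):
            arcname = os.path.relpath(path, root)
            tar.add(path, arcname=arcname, filter=normalize)
    return buf.getvalue()
```

With these three normalizations in place, building the archive twice from the same file contents yields byte-identical output, even if the files were touched in between.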

Some other smaller changes and enhancements:

  • Use new shebang style for Python entrypoints to make prefixes with spaces work better – thanks to Jaime Rodríguez-Guerra for figuring this out!
  • Work around some issues with cyclic symlinks found in a particular package
  • Fix an issue with setting last write time on certain file systems (thanks to @coroa)
  • This release also saw many changes in the CI setup, which should now be more stable.
  • And many more – for the full list have a look at the CHANGELOG
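On the shebang item in the list above: a plain #!/path/to/python line breaks when the path contains a space, because the kernel cuts the interpreter path at the first whitespace. The list does not spell out the new style. One widely used workaround, assumed here for illustration, is a header that is valid both as a shell script and as Python, so /bin/sh can exec the quoted interpreter path. The function name below is hypothetical.

```python
def sh_wrapped_shebang(python_path: str) -> str:
    """Return a script header that works when python_path has spaces.

    For /bin/sh, line 2 reads: exec "<python>" "$0" "$@"
    For Python, lines 2-3 are a harmless triple-quoted string.
    """
    # This sketch does not escape characters that are special inside
    # a double-quoted shell string or a Python string literal.
    forbidden = set("\"'$`\\\n")
    if forbidden & set(python_path):
        raise ValueError("unsupported character in interpreter path")
    return (
        "#!/bin/sh\n"
        "'''exec' \"" + python_path + "\" \"$0\" \"$@\"\n"
        "' '''\n"
    )


# Hypothetical prefix with a space in it.
header = sh_wrapped_shebang("/opt/my envs/demo/bin/python")
script = header + "import sys\nprint(sys.executable)\n"
```

The shell never reaches the third line because exec replaces the process, and Python ignores the first three lines because they are a comment followed by a string literal.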