mojo-zlib
A Mojo implementation of the Python zlib library, providing compression, decompression, and checksum functionality. This library offers a Python-compatible API for zlib operations in Mojo, enabling seamless migration from Python code.
Installation with Pixi
Make sure that you have https://repo.prefix.dev/mojo-community
in the channels
list of your pixi.toml
file.
Then, you can install the library with:
pixi add mojo-zlib
Common issues:
If your IDE tells you "Cannot find module 'zlib'": you have to go to the extensions menu, look for the Mojo or Mojo nightly extension, and click on the "Settings" icon. Then you'll find a section called "Mojo › Lsp: Include Dirs". Add the following path to the list:
.pixi/envs/default/lib/mojo
This is where Pixi adds the .mojopkg
files.
Then restart your IDE or just the Mojo extension and it should work.
Useful links:
Features
- Compression & Decompression: Full support for DEFLATE algorithm with zlib, gzip, and raw deflate formats
- Streaming Operations: Incremental compression and decompression for large datasets
- Checksum Functions: Pure Mojo implementations of CRC32 and Adler-32 algorithms
- Python Compatibility: API designed to match Python's zlib module
- Memory Efficient: Streaming operations avoid loading entire datasets into memory
Note Those bindings are not as fast and they could be. We are currently waiting for several Mojo language features to improve the speed.
Development
Run the unit tests with
pixi run test
Install the pre-commit with
pixi x pre-commit install
Quick Start
import zlib
fn main() raises:
# Basic compression and decompression
data = "Hello, World! This is a test string for compression.".as_bytes()
compressed = zlib.compress(data)
decompressed = zlib.decompress(compressed)
# Streaming compression for large data
compressor = zlib.compressobj()
data_part1 = "First part of data ".as_bytes()
data_part2 = "Second part of data".as_bytes()
chunk1 = compressor.compress(data_part1)
chunk2 = compressor.compress(data_part2)
final = compressor.flush()
result = chunk1 + chunk2 + final
# Checksum calculation
crc = zlib.crc32(data)
adler = zlib.adler32(data)
API Reference
Compression Functions
| Function | Signature |
|----------|-----------|
| compress
| fn compress(data: Span[UInt8], level: Int32 = -1, wbits: Int32 = MAX_WBITS) raises -> List[UInt8]
|
| compressobj
| fn compressobj(level: Int32 = -1, method: Int32 = Z_DEFLATED, wbits: Int32 = MAX_WBITS, memLevel: Int32 = DEF_MEM_LEVEL, strategy: Int32 = Z_DEFAULT_STRATEGY) raises -> Compress
|
Decompression Functions
| Function | Signature |
|----------|-----------|
| decompress
| fn decompress(data: Span[UInt8], wbits: Int32 = MAX_WBITS, bufsize: Int = DEF_BUF_SIZE) raises -> List[UInt8]
|
| decompressobj
| fn decompressobj(wbits: Int32 = MAX_WBITS) raises -> Decompress
|
Checksum Functions
Note that those are implemented in Mojo and thus do not require libz.so
to be installed.
| Function | Signature |
|----------|-----------|
| crc32
| fn crc32(data: Span[UInt8], value: UInt32 = 0) -> UInt32
|
| adler32
| fn adler32(data: Span[UInt8], value: UInt32 = 1) -> UInt32
|
Streaming Objects
| Object | Description |
|--------|-------------|
| Compress
| Streaming compression object with compress()
and flush()
methods |
| Decompress
| Streaming decompression object with decompress()
, flush()
, and status attributes |
Function Documentation
compress
zlib.compress(data: Span[UInt8], level: Int32 = -1, wbits: Int32 = zlib.MAX_WBITS) raises -> List[UInt8]
Compresses the bytes in data, returning a List[UInt8]
containing compressed data.
Parameters:
data
: The data to compresslevel
: Compression level from 0 to 9 controlling the compression speed/size tradeoff:0
: No compression1
: Fastest compression, least compression9
: Slowest compression, best compression-1
: Default compromise between speed and compression (equivalent to level 6)
wbits
: Window bits parameter controlling the compression format and window size:9
to15
: zlib format with header and checksum-9
to-15
: raw deflate format without header or checksum16 + (9 to 15)
: gzip format with header and checksum
Returns: Compressed data as List[UInt8]
Raises: Error
if compression fails or invalid parameters are provided
decompress
zlib.decompress(data: Span[UInt8], wbits: Int32 = zlib.MAX_WBITS, bufsize: Int = zlib.DEF_BUF_SIZE) raises -> List[UInt8]
Decompresses the bytes in data, returning a bytes object containing the uncompressed data.
Parameters:
data
: The compressed data to decompresswbits
: Window bits parameter controlling the compression format and window size:9
to15
: zlib format with header and checksum-9
to-15
: raw deflate format without header or checksum16 + (9 to 15)
: gzip format with header and checksum- Values
32-47
: automatic header detection (zlib or gzip)
bufsize
: Initial size of output buffer for decompressed data (default: 16384)
Returns: The decompressed data as List[UInt8]
Raises: Error
if the compressed data is invalid, corrupted, or incomplete
crc32
zlib.crc32(data: Span[UInt8], value: UInt32 = 0) -> UInt32
Computes a CRC (Cyclic Redundancy Check) checksum of data.
This computes a 32-bit checksum of data. The result is an unsigned 32-bit integer. If value is present, it is used as the starting value of the checksum; otherwise, a default value of 0 is used. Passing the value returned by a previous call allows computing a running checksum over the concatenation of several inputs.
Parameters:
data
: The data to compute the checksum forvalue
: Starting value of the checksum (default: 0). Can be the result of a previouscrc32()
call
Returns: An unsigned 32-bit integer representing the CRC-32 checksum
adler32
zlib.adler32(data: Span[UInt8], value: UInt32 = 1) -> UInt32
Computes an Adler-32 checksum of data.
An Adler-32 checksum is almost as reliable as a CRC32 but can be computed much faster. The result is an unsigned 32-bit integer. If value is present, it is used as the starting value of the checksum; otherwise, a default value of 1 is used. Passing the value returned by a previous call allows computing a running checksum over the concatenation of several inputs.
Parameters:
data
: The data to compute the checksum forvalue
: Starting value of the checksum (default: 1). Can be the result of a previousadler32()
call
Returns: An unsigned 32-bit integer representing the Adler-32 checksum
Streaming Objects
Compress Object
A compression object for compressing data incrementally. Allows compression of data that cannot fit into memory all at once.
There are no public attributes.
compressobj() (constructor)
zlib.compressobj(level: Int32 = -1, method: Int32 = zlib.Z_DEFLATED, wbits: Int32 = MAX_WBITS, memLevel: Int32 = zlib.DEF_MEM_LEVEL, strategy: Int32 = zlib.Z_DEFAULT_STRATEGY) raises -> zlib.Compress
Return a compression object whose compress()
method takes a Span[UInt8]
and returns compressed data for a portion of the data.
The returned object also has flush()
methods. This allows for incremental compression; it can be more efficient when compressing very large amounts of data.
Parameters:
level
: Compression level (same ascompress()
)method
: The compression algorithm. Currently, onlyDEFLATED
is supportedwbits
: Window bits parameter (same ascompress()
)memLevel
: Controls memory used for compression. Valid values 1-9. Higher values use more memory but are fasterstrategy
: Compression strategy:Z_DEFAULT_STRATEGY
,Z_FILTERED
,Z_HUFFMAN_ONLY
,Z_RLE
,Z_FIXED
Returns: A Compress
object for incremental compression
Raises: Error
if compression initialization fails or invalid parameters provided
compress()
zlib.Compress.compress(mut self, data: Span[UInt8]) raises -> List[UInt8]
Compress data, returning a List[UInt8]
containing compressed data for at least part of the data in data. This data should be concatenated to the output produced by any preceding calls to the compress()
method. Some input may be kept in internal buffers for later processing.
Parameters:
data
: Data to compress
Returns: Compressed data as List[UInt8]
. May be empty if input data is buffered internally
Raises: Error
if compression fails or flush()
has already been called
flush()
zlib.Compress.flush(mut self) raises -> List[UInt8]
Finish the compression process and return a List[UInt8]
containing any remaining compressed data. This method finishes the compression of any data that might remain in the internal buffers and returns the final compressed data. After calling flush()
, the compressor object cannot be used again.
Returns: Final compressed data as List[UInt8]
Raises: Error
if compression finalization fails
Decompress Object
A decompression object for decompressing data incrementally. Allows decompression of data that cannot fit into memory all at once. Contains attributes unused_data
, unconsumed_tail
, and eof
that provide information about the decompression process.
Attributes:
Decompress.unused_data: List[UInt8]
- Contains any bytes past the end of the compressed data. Always empty until the entire compressed stream has been decompressedDecompress.unconsumed_tail: List[UInt8]
- Contains any data that was not consumed by the lastdecompress()
call because it exceeded the limit on the uncompressed dataDecompress.eof: Bool
- True if the end-of-stream marker has been reached
decompressobj() (constructor)
zlib.decompressobj(wbits: Int32 = zlib.MAX_WBITS) raises -> zlib.Decompress
Return a decompression object whose decompress()
method takes a bytes object and returns decompressed data for a portion of the data.
The returned object also has decompress()
and flush()
methods, and unused_data
, unconsumed_tail
, and eof
attributes. This allows for incremental decompression when decompressing very large amounts of data.
Parameters:
wbits
: Window bits parameter (same asdecompress()
)
Returns: A Decompress
object for incremental decompression
Raises: Error
if decompression initialization fails or invalid parameters provided
decompress()
zlib.Decompress.decompress(mut self, data: Span[UInt8], max_length: Int = -1) raises -> List[UInt8]
Decompress data, returning a bytes object containing uncompressed data corresponding to at least part of the data in data. This data should be concatenated to the output produced by any preceding calls to the decompress()
method.
Parameters:
data
: Compressed data to decompressmax_length
: Maximum number of bytes to return. If negative (default), there is no limit
Returns: Decompressed data as List[UInt8]
. May be empty if input data is buffered internally
Raises: Error
if the data is invalid, corrupted, or incomplete
flush()
zlib.Decompress.flush(mut self) raises -> List[UInt8]
Return a bytes object containing any remaining uncompressed data. This method is primarily used to force any remaining uncompressed data in internal buffers to be returned.
Returns: Any remaining decompressed data as List[UInt8]
Raises: Error
if there are issues finalizing the decompression
Constants
The library provides Python-compatible constants:
- Compression Levels:
Z_NO_COMPRESSION
,Z_BEST_SPEED
,Z_BEST_COMPRESSION
,Z_DEFAULT_COMPRESSION
- Compression Methods:
DEFLATED
,Z_DEFLATED
- Compression Strategies:
Z_DEFAULT_STRATEGY
,Z_FILTERED
,Z_HUFFMAN_ONLY
,Z_RLE
,Z_FIXED
- Flush Modes:
Z_NO_FLUSH
,Z_PARTIAL_FLUSH
,Z_SYNC_FLUSH
,Z_FULL_FLUSH
,Z_FINISH
,Z_BLOCK
,Z_TREES
- Other:
MAX_WBITS
,DEF_BUF_SIZE
,DEF_MEM_LEVEL
,ZLIB_VERSION
,ZLIB_RUNTIME_VERSION
Examples
Basic Compression/Decompression
import zlib
fn main() raises:
text = "Hello, World! This is a longer text that will benefit from compression."
data = text.as_bytes()
# Compress data
compressed = zlib.compress(data, level=6)
print("Original size:", len(data), "Compressed size:", len(compressed))
# Decompress data
decompressed = zlib.decompress(compressed)
print("Decompression successful:", String(bytes=decompressed))
Streaming Compression
import zlib
fn main() raises:
# Create compressor
compressor = zlib.compressobj(level=9)
# Compress data in chunks
chunk1 = compressor.compress("First chunk of data ".as_bytes())
chunk2 = compressor.compress("Second chunk of data ".as_bytes())
chunk3 = compressor.compress("Final chunk of data".as_bytes())
final = compressor.flush()
# Combine results
compressed = chunk1 + chunk2 + chunk3 + final
print(String(bytes=zlib.decompress(compressed)))
Format-Specific Compression
import zlib
fn main() raises:
data = "Test data for different formats".as_bytes()
# Raw DEFLATE format (no header/trailer)
raw_compressed = zlib.compress(data, wbits=-15)
# Gzip format
gzip_compressed = zlib.compress(data, wbits=16+15)
# Standard zlib format (default)
zlib_compressed = zlib.compress(data, wbits=15)
Checksum Calculations
import zlib
fn main() raises:
data = "Data for checksum calculation".as_bytes()
# Calculate CRC32
crc = zlib.crc32(data)
print("CRC32:", crc)
# Calculate Adler32
adler = zlib.adler32(data)
print("Adler32:", adler)
# Running checksums
part1 = "First part ".as_bytes()
part2 = "second part".as_bytes()
crc1 = zlib.crc32(part1)
crc_total = zlib.crc32(part2, crc1) # Running checksum
print("CRC32 running total:", crc_total)
Performance
This library leverages the optimized zlib C library for compression operations while providing pure Mojo implementations for checksum functions.
License
The license of this project is MIT. Check LICENSE for more details.
Install
pixiaddmojo-zlib
micromambainstall-c https://repo.prefix.dev/mojo-communitymojo-zlib
Version
Platforms
Last published
filename | version | build | Created | size | Architecture | Downloads | |
---|---|---|---|---|---|---|---|
mojo-zlib-0.1.7-he8cfe8b_0.conda | 0.1.7 | he8cfe8b_0 (0) | 7 days ago | 972.88 KB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.7-hb0f4dca_0.conda | 0.1.7 | hb0f4dca_0 (0) | 7 days ago | 972.89 KB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.7-h60d57d3_0.conda | 0.1.7 | h60d57d3_0 (0) | 7 days ago | 972.84 KB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.6-h60d57d3_0.conda | 0.1.6 | h60d57d3_0 (0) | a month ago | 986.89 KB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.6-hb0f4dca_0.conda | 0.1.6 | hb0f4dca_0 (0) | a month ago | 986.89 KB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.6-he8cfe8b_0.conda | 0.1.6 | he8cfe8b_0 (0) | a month ago | 986.89 KB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.5-he8cfe8b_0.conda | 0.1.5 | he8cfe8b_0 (0) | a month ago | 1.03 MB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.5-h60d57d3_0.conda | 0.1.5 | h60d57d3_0 (0) | a month ago | 1.03 MB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.5-hb0f4dca_0.conda | 0.1.5 | hb0f4dca_0 (0) | a month ago | 1.03 MB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.3-hb0f4dca_0.conda | 0.1.3 | hb0f4dca_0 (0) | a month ago | 1.03 MB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.2-hb0f4dca_0.conda | 0.1.2 | hb0f4dca_0 (0) | a month ago | 1.03 MB | N/A | ||
Dependencies: | |||||||
mojo-zlib-0.1.1-hb0f4dca_0.conda | 0.1.1 | hb0f4dca_0 (0) | a month ago | 1.02 MB | N/A | ||
Dependencies: |