feat: Add LZ4 compression support to .apr model format #146

@noahgift

Summary

Add optional LZ4 compression to the .apr model format to reduce storage size and improve transfer speeds.

Motivation

  • .apr model files can be large (MBs to GBs)
  • Float32/float16 tensors can compress with LZ4 (estimated 2-10x; the actual ratio depends on the weights and should be confirmed by the benchmarks)
  • Faster uploads to Hugging Face Hub
  • Reduced disk usage for model storage
  • Synergy with trueno-zram for runtime decompression

Proposed Changes

  1. Add Compression enum to AprWriter/AprReader
  2. Support LZ4 compression (via trueno's kernel)
  3. Backward compatible: uncompressed files still work
  4. Magic bytes indicate compression type
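
A minimal sketch of how items 1 and 4 could fit together. The issue does not specify the .apr header layout, so the `APR` prefix, the one-byte tag, the tag values, and the helper names `encode_header` and `detect_compression` are all assumptions made for illustration. Mapping tag 0 to `Compression::None` is one possible way to keep uncompressed files readable.

```rust
/// Compression modes named in this issue. The numeric tags are assumed values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    None,
    Lz4,
    Zstd,
}

/// Hypothetical magic prefix; the real .apr magic bytes are not given in this issue.
const MAGIC_PREFIX: &[u8; 3] = b"APR";

impl Compression {
    /// Tag byte stored right after the magic prefix (assumed encoding).
    pub fn tag(self) -> u8 {
        match self {
            Compression::None => 0,
            Compression::Lz4 => 1,
            Compression::Zstd => 2,
        }
    }

    /// Inverse of `tag`; returns `None` for tags this sketch does not know.
    pub fn from_tag(tag: u8) -> Option<Self> {
        match tag {
            0 => Some(Compression::None),
            1 => Some(Compression::Lz4),
            2 => Some(Compression::Zstd),
            _ => None,
        }
    }
}

/// Build the 4-byte header: magic prefix followed by the compression tag.
pub fn encode_header(c: Compression) -> [u8; 4] {
    [MAGIC_PREFIX[0], MAGIC_PREFIX[1], MAGIC_PREFIX[2], c.tag()]
}

/// Detect the compression mode from the first four bytes of a file.
pub fn detect_compression(bytes: &[u8]) -> Result<Compression, String> {
    if bytes.len() < 4 || !bytes.starts_with(MAGIC_PREFIX) {
        return Err("not an .apr file (bad magic)".to_string());
    }
    Compression::from_tag(bytes[3])
        .ok_or_else(|| format!("unknown compression tag {}", bytes[3]))
}
```

With this layout a reader only needs the first four bytes to pick a decoder, and an unknown tag fails loudly instead of being treated as uncompressed data.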

API Design

// Writing compressed model
let writer = AprWriter::new()
    .with_compression(Compression::Lz4)
    .create("model.apr")?;

// Reading auto-detects compression
let reader = AprReader::open("model.apr")?;
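
One way the builder side of this API could be shaped, as a sketch only. The struct layout and the `compression()` accessor are assumptions, not the existing `AprWriter`; file creation is left out so the example stays self-contained. Defaulting to `Compression::None` means callers that never call `with_compression` keep producing uncompressed files.

```rust
/// Compression modes named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Compression {
    /// Default, so existing callers keep writing uncompressed files.
    #[default]
    None,
    Lz4,
    Zstd,
}

/// Hypothetical writer configuration; the real AprWriter has more state.
#[derive(Debug, Default)]
pub struct AprWriter {
    compression: Compression,
}

impl AprWriter {
    /// Start with default settings (no compression).
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter: takes `self` by value and returns it for chaining.
    pub fn with_compression(mut self, compression: Compression) -> Self {
        self.compression = compression;
        self
    }

    /// Accessor added for this sketch so the chosen mode can be inspected.
    pub fn compression(&self) -> Compression {
        self.compression
    }
}
```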

Integration

  • trueno-gpu: GPU batch compression for large models
  • trueno-zram: CPU SIMD compression fallback
  • batuta: Orchestration for model export/import

Acceptance Criteria

  • Compression enum: None, Lz4, Zstd
  • AprWriter::with_compression() method
  • AprReader auto-detects compression from header
  • Backward compatible with uncompressed .apr files
  • Benchmarks showing compression ratio and speed
  • Tests for roundtrip compression
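
The last two criteria could be exercised with a harness along these lines. This is a sketch: `compress` and `decompress` here are a stand-in byte-level run-length codec, NOT LZ4, used only so the example runs without external crates. The helper names `tensor_bytes`, `compression_ratio`, and `roundtrips` are hypothetical. A real test would call the LZ4 path chosen for this feature instead.

```rust
/// Stand-in codec (byte-level run-length encoding). NOT LZ4; placeholder only.
/// Output is a sequence of (run_length, byte) pairs with run_length in 1..=255.
fn compress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Inverse of the stand-in codec above.
fn decompress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in input.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}

/// Serialize f32 values to little-endian bytes, as a tensor payload might be stored.
fn tensor_bytes(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Ratio of original size to compressed size; values above 1.0 mean the data shrank.
/// Not meaningful when `compressed` is empty.
fn compression_ratio(original: &[u8], compressed: &[u8]) -> f64 {
    original.len() as f64 / compressed.len() as f64
}

/// Roundtrip check: decompress(compress(x)) must reproduce x byte for byte.
fn roundtrips(values: &[f32]) -> bool {
    let raw = tensor_bytes(values);
    decompress(&compress(&raw)) == raw
}
```

The ratio depends heavily on the data: an all-zero tensor shrinks a lot, while dense weights with few repeated bytes may not shrink at all, so the benchmarks are best run on real model files.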
