
fix(mlx): bundle native libs and broaden error handling for Apple Silicon#93

Merged
jamiepine merged 1 commit into jamiepine:main from iJaack:fix/mlx-apple-silicon-binary on Feb 23, 2026

Conversation


@iJaack iJaack commented Feb 18, 2026

Problem

The distributed macOS aarch64 binary (voicebox_aarch64.app.tar.gz) silently falls back to PyTorch CPU instead of using MLX, despite the README advertising 4-5x faster inference on Apple Silicon.

Two root causes:

1. platform_detect.py only catches ImportError

Inside a PyInstaller bundle, MLX loads its Metal shader libraries (.metallib) at import time. When those files are missing from the bundle, Python raises OSError — not ImportError. The original code:

try:
    import mlx
    return "mlx"
except ImportError:   # ← OSError slips through here
    return "pytorch"

This causes a silent fallback to PyTorch on every Apple Silicon machine running the binary.

2. collect_data_files misses native libraries

build_binary.py and voicebox-server.spec used --collect-data / collect_data_files for mlx and mlx_audio. This only copies pure-Python files — native .dylib and .metallib files are excluded. PyInstaller needs --collect-all / collect_all + Analysis(binaries=...) to bundle shared libraries correctly.
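A minimal sketch of what the corrected spec-file change looks like, assuming the package names `mlx` and `mlx_audio` from the PR and a hypothetical entry script name. `collect_all()` returns a `(datas, binaries, hiddenimports)` triple, unlike `collect_data_files()`, which returns data files only:

```python
# Sketch of a PyInstaller .spec fragment (not a standalone script).
# "server.py" is a hypothetical entry-point name for illustration.
from PyInstaller.utils.hooks import collect_all

datas, binaries, hiddenimports = [], [], []
for pkg in ("mlx", "mlx_audio"):
    d, b, h = collect_all(pkg)
    datas += d
    binaries += b          # native .dylib / .metallib files land here
    hiddenimports += h

a = Analysis(
    ["server.py"],
    datas=datas,
    binaries=binaries,     # pass the native libs through to the bundle
    hiddenimports=hiddenimports,
)
```

The key point is that `binaries` must actually reach `Analysis(binaries=...)`; collecting them without passing them through still produces a bundle missing the Metal shader libraries.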

Fix

  1. platform_detect.py — catch (ImportError, OSError, RuntimeError) and import mlx.core (forces native lib loading eagerly so any failure is caught here)
  2. build_binary.py — replace --collect-data mlx/mlx_audio with --collect-all
  3. voicebox-server.spec — use collect_all() and pass collected binaries to Analysis(binaries=...)
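Putting fix 1 together, a sketch of the broadened detection logic (the function name `get_backend_type` is taken from the commit message below; the exact body in the PR may differ):

```python
def get_backend_type() -> str:
    try:
        # Import mlx.core rather than mlx so the native Metal libraries
        # are loaded eagerly; a missing .metallib then fails right here.
        import mlx.core  # noqa: F401
        return "mlx"
    except (ImportError, OSError, RuntimeError):
        # Any native-load failure, not just a missing module,
        # falls back to the PyTorch CPU path.
        return "pytorch"
```

With only `except ImportError`, the `OSError` raised inside a PyInstaller bundle would propagate (or be swallowed elsewhere) instead of triggering the fallback deliberately.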

Testing

Reproduced on macOS 15 (Apple Silicon M4 Mac mini):

  • Before fix: binary logs Backend: pytorch, generation takes 30+ min for a 5s clip
  • After fix (source build with MLX): logs Backend: MLX | GPU: MPS (Apple Silicon), same clip in ~39s (first run) / ~8s (warm)

fix(mlx): bundle native libs and broaden error handling for Apple Silicon

The distributed macOS aarch64 binary shipped without MLX acceleration despite
the model and backend code supporting it. Two root causes:

1. **OSError not caught in platform_detect.py**
   PyInstaller bundles isolate the filesystem, so when MLX tries to load its
   Metal shader libraries (.metallib) it raises OSError, not ImportError.
   platform_detect.get_backend_type() only caught ImportError, causing a
   silent fallback to PyTorch even on Apple Silicon hardware.
   Fix: broaden the except clause to (ImportError, OSError, RuntimeError)
   and import mlx.core instead of mlx (forces native lib loading eagerly).

2. **collect_data_files used instead of collect_all for MLX**
   build_binary.py and voicebox-server.spec used --collect-data /
   collect_data_files for mlx and mlx_audio. This copies Python source and
   pure-Python data, but NOT native shared libraries (.dylib, .metallib).
   Fix: switch to --collect-all / collect_all which captures binaries too,
   then pass them to Analysis(binaries=...) in the spec.

Result: macOS Apple Silicon users now get MLX inference (~4-5x faster than
PyTorch CPU), matching the performance documented in the README.

@jamiepine jamiepine merged commit e4bb288 into jamiepine:main Feb 23, 2026