Fix mlx-ci.yml torch install gap + finetune-last-n-layers FakeModel stubs by danielhanchen · Pull Request #700 · unslothai/unsloth-zoo

danielhanchen · 2026-05-27T14:44:48Z

Two unrelated cleanups surfaced while validating #697.

1. `.github/workflows/mlx-ci.yml` fails on every scheduled run

`pyproject.toml` gates `torch` off on darwin+arm64 (mlx is the native path on Apple Silicon), so `pip install -e .[mlx]` on macos-14 does not pull torch. But `tests/mlx_simulation/init.py:52` unconditionally does `import torch` -- the autouse shim that powers the test suite needs it to monkeypatch `torch.Tensor` with MLX-only methods (`.astype`, `.expand_dims`, `.at[]`).

Collection of `tests/test_mlx_torch_shim_smoke.py` blows up with `ModuleNotFoundError: No module named 'torch'` before any test runs.

Reproduces: most recent scheduled cron run on `main` (run id 26499388502) failed in 52s with exactly that error at collection time.

Fix: install the CPU torch wheel explicitly before `.[mlx]`, matching the local-dev install order.

2. `tests/test_mlx_finetune_last_n_layers.py` FakeModel is incomplete

`unsloth_zoo/mlx/loader.py::FastMLXModel.get_peft_model` prints a "LoRA applied -- N trainable params (P% of T total)" stats line at the tail of the function via `mlx.utils.tree_flatten(model.trainable_parameters())` plus `model.parameters()`. The synthetic FakeModel in this test never implemented those two methods, so the assertion targeting `linear_to_lora_layers`'s captured `num_layers` is never reached -- the test crashes at `loader.py:3712` with `AttributeError: 'FakeModel' object has no attribute 'trainable_parameters'`.

The test exercises a parameter passthrough; the stats line is incidental. Fix: stub `trainable_parameters` and `parameters` on FakeModel to return empty dicts so `tree_flatten` yields zero entries and the stats line prints "0.00% of 0 total" without raising.

Verification

```
python -m pytest tests/test_mlx_finetune_last_n_layers.py -v
2 passed
python -m pytest tests/test_mlx_*.py -q
180 passed, 1 skipped
python -m pytest tests/ -q --ignore=tests/security
1017 passed, 50 skipped
```

Both fixes are independent of each other and of #697; bundled here because both surfaced in the same audit pass.

…tubs Two unrelated cleanups surfaced while validating PR #697: 1. .github/workflows/mlx-ci.yml fails on every scheduled run. pyproject gates `torch` off on darwin+arm64 (mlx is the native path on Apple Silicon), so `pip install -e .[mlx]` on macos-14 does not pull torch. But tests/mlx_simulation/__init__.py:52 unconditionally `import torch` so the autouse shim that powers the test suite can monkeypatch torch.Tensor with MLX-only methods. Collection of tests/test_mlx_torch_shim_smoke.py blows up with `ModuleNotFoundError: No module named 'torch'` before any test runs. Reproduces: most recent scheduled cron run on main (run id 26499388502) failed in 52s with exactly that error at collection time. Fix: install the CPU torch wheel explicitly before `.[mlx]`, matching the local-dev install order. Three-line patch. 2. tests/test_mlx_finetune_last_n_layers.py FakeModel is incomplete. `unsloth_zoo/mlx/loader.py::FastMLXModel.get_peft_model` prints a "LoRA applied -- N trainable params (P% of T total)" stats line at the tail of the function via `mlx.utils.tree_flatten(model.trainable_parameters())` plus `model.parameters()`. The synthetic FakeModel in this test never implemented those two methods, so the assertion targeting `linear_to_lora_layers`'s captured `num_layers` is never reached -- the test crashes at `loader.py:3712` with `AttributeError: 'FakeModel' object has no attribute 'trainable_parameters'`. The test exercises a parameter passthrough; the stats line is incidental. Fix: stub `trainable_parameters` and `parameters` on FakeModel to return empty dicts so tree_flatten yields zero entries and the stats line prints "0.00% of 0 total" without raising. Verified locally: python -m pytest tests/test_mlx_finetune_last_n_layers.py -v 2 passed python -m pytest tests/test_mlx_*.py -q 180 passed, 1 skipped python -m pytest tests/ -q --ignore=tests/security 1017 passed, 50 skipped

gemini-code-assist

Code Review

This pull request updates the FakeModel class in tests/test_mlx_finetune_last_n_layers.py by adding mock trainable_parameters and parameters methods that return empty dictionaries. This prevents AttributeError exceptions when get_peft_model attempts to print LoRA parameter statistics during testing. There are no review comments to address.

pyproject.toml line 26 declares torch>=2.4.0,<2.13.0 ; (sys_platform != 'darwin' or platform_machine != 'arm64') The mlx-ci.yml install used <2.11.0 by mistake. Bring the workflow into sync so it does not block torch 2.11 / 2.12 which the project itself supports.

* fix(mlx): preserve vlm merged config saves * fix(mlx): dequantize qlora merged saves * fix(mlx): materialize tied lm head exports * fix(mlx): forward gguf export options * fix(mlx): normalize vlm mmproj export tensors * fix(mlx): rewrite VLM GGUF tensor candidates Prefer HF vision aliases when inverting mlx-vlm sanitizers, while keeping same-name tensor rewrites as a fallback. This preserves Gemma3 patch embedding layout fixes and avoids stopping early on Qwen-family MLX vision_tower names. * fix(mlx): replay model sanitizers for VLM GGUF Use loaded VLM model instances when replaying mlx-vlm sanitizers for GGUF tensor rewrites, while keeping the config-derived class pipeline as a fallback. This covers models whose top-level sanitizer delegates through submodules, such as GLM-OCR. * fix(mlx): extract vlm config objects for saves * fix(gguf): run package converters beside conversion * fix(mlx): align gguf nextn metadata with tensors * fix(mlx): repair degraded VLM processors * fix(mlx): preserve source save sidecars * test(mlx): cover save export regressions * fix(mlx): harden save export review cases * fix(mlx): address save export review cases * revert(mlx): drop tied lm head materialization This reverts commit d8cdf89. Also removes the related regression coverage and later helper hardening now that bug-4 is out of scope. * Fix non-object sidecar JSON handling and add edge case tests for PR #697 - unsloth_zoo/mlx/loader.py: _read_json_file returned lists/strings/None for sidecars holding valid but non-object JSON, which crashed _repair_degraded_vlm_processor with AttributeError on .get(). Coerce non-dict payloads to {} to match the documented contract. - tests/test_mlx_save_export_edge_cases.py: 71 tests covering branches the regression suite does not reach: rank-3/5D/1D tensor matching, the _MlxVlmSanitizeProxy two-arg sanitize path, the real shard rewrite loop with safetensors files plus index.json remapping, duplicate-name guards, NextN decrement and skip branches, sidecar filesystem edges (unicode, symlinks, permissions, src equals dst), six-thread concurrent GGUF exports over the patcher env lock, a user-set UNSLOTH_LLAMA_CPP_SCRIPTS_DIR override, and monolith vs package converter placement. - tests/mlx_simulation/mlx_stub.py: save_safetensors now normalizes tensors to contiguous; real MLX has no contiguity constraint, so the shim should not reject the rewrite loop's transposed views. All 273 MLX tests pass on Linux via the simulation shim across Python 3.10-3.13 and transformers 4.51.3, 4.57.6 and 5.5.0. The only failure is the pre-existing FakeModel one in test_mlx_finetune_last_n_layers.py, tracked in #700. * Tighten comments and docstrings for PR #697 Comment and docstring only pass over the PR diff, applying the repo brevity policy: collapse multi line blocks, drop comments that restate the code, and keep load bearing rationale. Scope is restricted to lines this PR adds so the pass cannot conflict with the comment tightening already merged on main. - unsloth_zoo/llama_cpp.py: 3 line placement comment down to 2. - unsloth_zoo/mlx/utils.py: file backed safetensors gotcha down to 2 lines, dropped a comment restating _copy_source_sidecars. - unsloth_zoo/mlx/loader.py: _repair_degraded_vlm_processor docstring 8 to 5 lines keeping the why. - tests/test_mlx_save_export_regressions.py: module docstring 5 to 3 lines. - tests/test_mlx_save_export_edge_cases.py: 14 three line section banners collapsed to one line each, two line docstrings to one, removed docstrings and comments that restate test names or adjacent assertions. Net 45 lines removed. Every file verified code identical to the previous commit by AST comparison with docstrings stripped; comment_tools.py check passes for all files except the one intentional docstring deletion, which the AST comparison covers. Full MLX suite: 273 passed, 1 skipped, plus the known pre existing failure tracked in #700. * Address review feedback for PR #697 All four review comments were valid; each is fixed with a regression test in tests/test_mlx_save_export_edge_cases.py: - push_to_hub_gguf: first_conversion moved after token and private so pre-existing positional callers keep their meaning; it was inserted before token and would have bound a positionally passed HF token as the intermediate GGUF dtype. - _get_mlx_vlm_model_sanitize_pipelines: wrappers without a top level sanitize no longer return zero pipelines; submodule sanitizers such as vision_tower.sanitize now produce replay pipelines on their own. - _resolve_mlx_vlm_processor_class: probes the MODEL_REMAPPING target package as well, so aliased model types resolve their custom processor classes. - _build_vlm_image_processor_from_config: falls back to the mlx-vlm processing module for image processor classes that do not exist in transformers, before the AutoImageProcessor retry. Full MLX suite: 278 passed, 1 skipped. Full repo suite: 1114 passed; the two failures are pre-existing on the branch (FakeModel stub tracked in #700, and the PyPI version sync test which lags because this branch predates the latest release and resolves on merge). --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>

…all-and-finetune-fakemodel # Conflicts: # tests/test_mlx_finetune_last_n_layers.py

gemini-code-assist Bot reviewed May 27, 2026

View reviewed changes

danielhanchen mentioned this pull request Jun 10, 2026

Fix MLX save and GGUF export parity issues #697

Merged

Merge remote-tracking branch 'origin/main' into fix/mlx-ci-torch-inst…

0ece2a1

…all-and-finetune-fakemodel # Conflicts: # tests/test_mlx_finetune_last_n_layers.py

danielhanchen merged commit d676285 into main Jun 10, 2026
14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix mlx-ci.yml torch install gap + finetune-last-n-layers FakeModel stubs#700

Fix mlx-ci.yml torch install gap + finetune-last-n-layers FakeModel stubs#700
danielhanchen merged 3 commits into
mainfrom
fix/mlx-ci-torch-install-and-finetune-fakemodel

danielhanchen commented May 27, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

danielhanchen commented May 27, 2026

1. `.github/workflows/mlx-ci.yml` fails on every scheduled run

2. `tests/test_mlx_finetune_last_n_layers.py` FakeModel is incomplete

Verification

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants