Fix mlx-ci.yml torch install gap + finetune-last-n-layers FakeModel stubs#700
Merged
danielhanchen merged 3 commits intoJun 10, 2026
Merged
Conversation
…tubs Two unrelated cleanups surfaced while validating PR #697: 1. .github/workflows/mlx-ci.yml fails on every scheduled run. pyproject gates `torch` off on darwin+arm64 (mlx is the native path on Apple Silicon), so `pip install -e .[mlx]` on macos-14 does not pull torch. But tests/mlx_simulation/__init__.py:52 unconditionally `import torch` so the autouse shim that powers the test suite can monkeypatch torch.Tensor with MLX-only methods. Collection of tests/test_mlx_torch_shim_smoke.py blows up with `ModuleNotFoundError: No module named 'torch'` before any test runs. Reproduces: most recent scheduled cron run on main (run id 26499388502) failed in 52s with exactly that error at collection time. Fix: install the CPU torch wheel explicitly before `.[mlx]`, matching the local-dev install order. Three-line patch. 2. tests/test_mlx_finetune_last_n_layers.py FakeModel is incomplete. `unsloth_zoo/mlx/loader.py::FastMLXModel.get_peft_model` prints a "LoRA applied -- N trainable params (P% of T total)" stats line at the tail of the function via `mlx.utils.tree_flatten(model.trainable_parameters())` plus `model.parameters()`. The synthetic FakeModel in this test never implemented those two methods, so the assertion targeting `linear_to_lora_layers`'s captured `num_layers` is never reached -- the test crashes at `loader.py:3712` with `AttributeError: 'FakeModel' object has no attribute 'trainable_parameters'`. The test exercises a parameter passthrough; the stats line is incidental. Fix: stub `trainable_parameters` and `parameters` on FakeModel to return empty dicts so tree_flatten yields zero entries and the stats line prints "0.00% of 0 total" without raising. Verified locally: python -m pytest tests/test_mlx_finetune_last_n_layers.py -v 2 passed python -m pytest tests/test_mlx_*.py -q 180 passed, 1 skipped python -m pytest tests/ -q --ignore=tests/security 1017 passed, 50 skipped
Contributor
There was a problem hiding this comment.
Code Review
This pull request updates the FakeModel class in tests/test_mlx_finetune_last_n_layers.py by adding mock trainable_parameters and parameters methods that return empty dictionaries. This prevents AttributeError exceptions when get_peft_model attempts to print LoRA parameter statistics during testing. There are no review comments to address.
pyproject.toml line 26 declares torch>=2.4.0,<2.13.0 ; (sys_platform != 'darwin' or platform_machine != 'arm64') The mlx-ci.yml install used <2.11.0 by mistake. Bring the workflow into sync so it does not block torch 2.11 / 2.12 which the project itself supports.
danielhanchen
added a commit
that referenced
this pull request
Jun 10, 2026
* fix(mlx): preserve vlm merged config saves * fix(mlx): dequantize qlora merged saves * fix(mlx): materialize tied lm head exports * fix(mlx): forward gguf export options * fix(mlx): normalize vlm mmproj export tensors * fix(mlx): rewrite VLM GGUF tensor candidates Prefer HF vision aliases when inverting mlx-vlm sanitizers, while keeping same-name tensor rewrites as a fallback. This preserves Gemma3 patch embedding layout fixes and avoids stopping early on Qwen-family MLX vision_tower names. * fix(mlx): replay model sanitizers for VLM GGUF Use loaded VLM model instances when replaying mlx-vlm sanitizers for GGUF tensor rewrites, while keeping the config-derived class pipeline as a fallback. This covers models whose top-level sanitizer delegates through submodules, such as GLM-OCR. * fix(mlx): extract vlm config objects for saves * fix(gguf): run package converters beside conversion * fix(mlx): align gguf nextn metadata with tensors * fix(mlx): repair degraded VLM processors * fix(mlx): preserve source save sidecars * test(mlx): cover save export regressions * fix(mlx): harden save export review cases * fix(mlx): address save export review cases * revert(mlx): drop tied lm head materialization This reverts commit d8cdf89. Also removes the related regression coverage and later helper hardening now that bug-4 is out of scope. * Fix non-object sidecar JSON handling and add edge case tests for PR #697 - unsloth_zoo/mlx/loader.py: _read_json_file returned lists/strings/None for sidecars holding valid but non-object JSON, which crashed _repair_degraded_vlm_processor with AttributeError on .get(). Coerce non-dict payloads to {} to match the documented contract. - tests/test_mlx_save_export_edge_cases.py: 71 tests covering branches the regression suite does not reach: rank-3/5D/1D tensor matching, the _MlxVlmSanitizeProxy two-arg sanitize path, the real shard rewrite loop with safetensors files plus index.json remapping, duplicate-name guards, NextN decrement and skip branches, sidecar filesystem edges (unicode, symlinks, permissions, src equals dst), six-thread concurrent GGUF exports over the patcher env lock, a user-set UNSLOTH_LLAMA_CPP_SCRIPTS_DIR override, and monolith vs package converter placement. - tests/mlx_simulation/mlx_stub.py: save_safetensors now normalizes tensors to contiguous; real MLX has no contiguity constraint, so the shim should not reject the rewrite loop's transposed views. All 273 MLX tests pass on Linux via the simulation shim across Python 3.10-3.13 and transformers 4.51.3, 4.57.6 and 5.5.0. The only failure is the pre-existing FakeModel one in test_mlx_finetune_last_n_layers.py, tracked in #700. * Tighten comments and docstrings for PR #697 Comment and docstring only pass over the PR diff, applying the repo brevity policy: collapse multi line blocks, drop comments that restate the code, and keep load bearing rationale. Scope is restricted to lines this PR adds so the pass cannot conflict with the comment tightening already merged on main. - unsloth_zoo/llama_cpp.py: 3 line placement comment down to 2. - unsloth_zoo/mlx/utils.py: file backed safetensors gotcha down to 2 lines, dropped a comment restating _copy_source_sidecars. - unsloth_zoo/mlx/loader.py: _repair_degraded_vlm_processor docstring 8 to 5 lines keeping the why. - tests/test_mlx_save_export_regressions.py: module docstring 5 to 3 lines. - tests/test_mlx_save_export_edge_cases.py: 14 three line section banners collapsed to one line each, two line docstrings to one, removed docstrings and comments that restate test names or adjacent assertions. Net 45 lines removed. Every file verified code identical to the previous commit by AST comparison with docstrings stripped; comment_tools.py check passes for all files except the one intentional docstring deletion, which the AST comparison covers. Full MLX suite: 273 passed, 1 skipped, plus the known pre existing failure tracked in #700. * Address review feedback for PR #697 All four review comments were valid; each is fixed with a regression test in tests/test_mlx_save_export_edge_cases.py: - push_to_hub_gguf: first_conversion moved after token and private so pre-existing positional callers keep their meaning; it was inserted before token and would have bound a positionally passed HF token as the intermediate GGUF dtype. - _get_mlx_vlm_model_sanitize_pipelines: wrappers without a top level sanitize no longer return zero pipelines; submodule sanitizers such as vision_tower.sanitize now produce replay pipelines on their own. - _resolve_mlx_vlm_processor_class: probes the MODEL_REMAPPING target package as well, so aliased model types resolve their custom processor classes. - _build_vlm_image_processor_from_config: falls back to the mlx-vlm processing module for image processor classes that do not exist in transformers, before the AutoImageProcessor retry. Full MLX suite: 278 passed, 1 skipped. Full repo suite: 1114 passed; the two failures are pre-existing on the branch (FakeModel stub tracked in #700, and the PyPI version sync test which lags because this branch predates the latest release and resolves on merge). --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>
…all-and-finetune-fakemodel # Conflicts: # tests/test_mlx_finetune_last_n_layers.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two unrelated cleanups surfaced while validating #697.
1. `.github/workflows/mlx-ci.yml` fails on every scheduled run
`pyproject.toml` gates `torch` off on darwin+arm64 (mlx is the native path on Apple Silicon), so `pip install -e .[mlx]` on macos-14 does not pull torch. But `tests/mlx_simulation/init.py:52` unconditionally does `import torch` -- the autouse shim that powers the test suite needs it to monkeypatch `torch.Tensor` with MLX-only methods (`.astype`, `.expand_dims`, `.at[]`).
Collection of `tests/test_mlx_torch_shim_smoke.py` blows up with `ModuleNotFoundError: No module named 'torch'` before any test runs.
Reproduces: most recent scheduled cron run on `main` (run id 26499388502) failed in 52s with exactly that error at collection time.
Fix: install the CPU torch wheel explicitly before `.[mlx]`, matching the local-dev install order.
2. `tests/test_mlx_finetune_last_n_layers.py` FakeModel is incomplete
`unsloth_zoo/mlx/loader.py::FastMLXModel.get_peft_model` prints a "LoRA applied -- N trainable params (P% of T total)" stats line at the tail of the function via `mlx.utils.tree_flatten(model.trainable_parameters())` plus `model.parameters()`. The synthetic FakeModel in this test never implemented those two methods, so the assertion targeting `linear_to_lora_layers`'s captured `num_layers` is never reached -- the test crashes at `loader.py:3712` with `AttributeError: 'FakeModel' object has no attribute 'trainable_parameters'`.
The test exercises a parameter passthrough; the stats line is incidental. Fix: stub `trainable_parameters` and `parameters` on FakeModel to return empty dicts so `tree_flatten` yields zero entries and the stats line prints "0.00% of 0 total" without raising.
Verification
```
python -m pytest tests/test_mlx_finetune_last_n_layers.py -v
2 passed
python -m pytest tests/test_mlx_*.py -q
180 passed, 1 skipped
python -m pytest tests/ -q --ignore=tests/security
1017 passed, 50 skipped
```
Both fixes are independent of each other and of #697; bundled here because both surfaced in the same audit pass.