Skip to content

Fix mlx-ci.yml torch install gap + finetune-last-n-layers FakeModel stubs#700

Merged
danielhanchen merged 3 commits into
mainfrom
fix/mlx-ci-torch-install-and-finetune-fakemodel
Jun 10, 2026
Merged

Fix mlx-ci.yml torch install gap + finetune-last-n-layers FakeModel stubs#700
danielhanchen merged 3 commits into
mainfrom
fix/mlx-ci-torch-install-and-finetune-fakemodel

Conversation

@danielhanchen

Copy link
Copy Markdown
Member

Two unrelated cleanups surfaced while validating #697.

1. `.github/workflows/mlx-ci.yml` fails on every scheduled run

`pyproject.toml` gates `torch` off on darwin+arm64 (mlx is the native path on Apple Silicon), so `pip install -e .[mlx]` on macos-14 does not pull torch. But `tests/mlx_simulation/init.py:52` unconditionally does `import torch` -- the autouse shim that powers the test suite needs it to monkeypatch `torch.Tensor` with MLX-only methods (`.astype`, `.expand_dims`, `.at[]`).

Collection of `tests/test_mlx_torch_shim_smoke.py` blows up with `ModuleNotFoundError: No module named 'torch'` before any test runs.

Reproduces: most recent scheduled cron run on `main` (run id 26499388502) failed in 52s with exactly that error at collection time.

Fix: install the CPU torch wheel explicitly before `.[mlx]`, matching the local-dev install order.

2. `tests/test_mlx_finetune_last_n_layers.py` FakeModel is incomplete

`unsloth_zoo/mlx/loader.py::FastMLXModel.get_peft_model` prints a "LoRA applied -- N trainable params (P% of T total)" stats line at the tail of the function via `mlx.utils.tree_flatten(model.trainable_parameters())` plus `model.parameters()`. The synthetic FakeModel in this test never implemented those two methods, so the assertion targeting `linear_to_lora_layers`'s captured `num_layers` is never reached -- the test crashes at `loader.py:3712` with `AttributeError: 'FakeModel' object has no attribute 'trainable_parameters'`.

The test exercises a parameter passthrough; the stats line is incidental. Fix: stub `trainable_parameters` and `parameters` on FakeModel to return empty dicts so `tree_flatten` yields zero entries and the stats line prints "0.00% of 0 total" without raising.

Verification

```
python -m pytest tests/test_mlx_finetune_last_n_layers.py -v
2 passed
python -m pytest tests/test_mlx_*.py -q
180 passed, 1 skipped
python -m pytest tests/ -q --ignore=tests/security
1017 passed, 50 skipped
```

Both fixes are independent of each other and of #697; bundled here because both surfaced in the same audit pass.

…tubs

Two unrelated cleanups surfaced while validating PR #697:

1. .github/workflows/mlx-ci.yml fails on every scheduled run.

   pyproject gates `torch` off on darwin+arm64 (mlx is the native path
   on Apple Silicon), so `pip install -e .[mlx]` on macos-14 does not
   pull torch. But tests/mlx_simulation/__init__.py:52 unconditionally
   `import torch` so the autouse shim that powers the test suite can
   monkeypatch torch.Tensor with MLX-only methods. Collection of
   tests/test_mlx_torch_shim_smoke.py blows up with
   `ModuleNotFoundError: No module named 'torch'` before any test runs.

   Reproduces: most recent scheduled cron run on main (run id
   26499388502) failed in 52s with exactly that error at collection
   time. Fix: install the CPU torch wheel explicitly before `.[mlx]`,
   matching the local-dev install order. Three-line patch.

2. tests/test_mlx_finetune_last_n_layers.py FakeModel is incomplete.

   `unsloth_zoo/mlx/loader.py::FastMLXModel.get_peft_model` prints a
   "LoRA applied -- N trainable params (P% of T total)" stats line at
   the tail of the function via
   `mlx.utils.tree_flatten(model.trainable_parameters())` plus
   `model.parameters()`. The synthetic FakeModel in this test never
   implemented those two methods, so the assertion targeting
   `linear_to_lora_layers`'s captured `num_layers` is never reached --
   the test crashes at `loader.py:3712` with
   `AttributeError: 'FakeModel' object has no attribute
   'trainable_parameters'`.

   The test exercises a parameter passthrough; the stats line is
   incidental. Fix: stub `trainable_parameters` and `parameters` on
   FakeModel to return empty dicts so tree_flatten yields zero entries
   and the stats line prints "0.00% of 0 total" without raising.

Verified locally:
  python -m pytest tests/test_mlx_finetune_last_n_layers.py -v
    2 passed
  python -m pytest tests/test_mlx_*.py -q
    180 passed, 1 skipped
  python -m pytest tests/ -q --ignore=tests/security
    1017 passed, 50 skipped

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the FakeModel class in tests/test_mlx_finetune_last_n_layers.py by adding mock trainable_parameters and parameters methods that return empty dictionaries. This prevents AttributeError exceptions when get_peft_model attempts to print LoRA parameter statistics during testing. There are no review comments to address.

pyproject.toml line 26 declares
  torch>=2.4.0,<2.13.0 ; (sys_platform != 'darwin' or platform_machine != 'arm64')

The mlx-ci.yml install used <2.11.0 by mistake. Bring the workflow into
sync so it does not block torch 2.11 / 2.12 which the project itself
supports.
danielhanchen added a commit that referenced this pull request Jun 10, 2026
* fix(mlx): preserve vlm merged config saves

* fix(mlx): dequantize qlora merged saves

* fix(mlx): materialize tied lm head exports

* fix(mlx): forward gguf export options

* fix(mlx): normalize vlm mmproj export tensors

* fix(mlx): rewrite VLM GGUF tensor candidates

Prefer HF vision aliases when inverting mlx-vlm sanitizers, while keeping same-name tensor rewrites as a fallback. This preserves Gemma3 patch embedding layout fixes and avoids stopping early on Qwen-family MLX vision_tower names.

* fix(mlx): replay model sanitizers for VLM GGUF

Use loaded VLM model instances when replaying mlx-vlm sanitizers for GGUF tensor rewrites, while keeping the config-derived class pipeline as a fallback. This covers models whose top-level sanitizer delegates through submodules, such as GLM-OCR.

* fix(mlx): extract vlm config objects for saves

* fix(gguf): run package converters beside conversion

* fix(mlx): align gguf nextn metadata with tensors

* fix(mlx): repair degraded VLM processors

* fix(mlx): preserve source save sidecars

* test(mlx): cover save export regressions

* fix(mlx): harden save export review cases

* fix(mlx): address save export review cases

* revert(mlx): drop tied lm head materialization

This reverts commit d8cdf89.

Also removes the related regression coverage and later helper hardening now that bug-4 is out of scope.

* Fix non-object sidecar JSON handling and add edge case tests for PR #697

- unsloth_zoo/mlx/loader.py: _read_json_file returned lists/strings/None for
  sidecars holding valid but non-object JSON, which crashed
  _repair_degraded_vlm_processor with AttributeError on .get(). Coerce
  non-dict payloads to {} to match the documented contract.
- tests/test_mlx_save_export_edge_cases.py: 71 tests covering branches the
  regression suite does not reach: rank-3/5D/1D tensor matching, the
  _MlxVlmSanitizeProxy two-arg sanitize path, the real shard rewrite loop
  with safetensors files plus index.json remapping, duplicate-name guards,
  NextN decrement and skip branches, sidecar filesystem edges (unicode,
  symlinks, permissions, src equals dst), six-thread concurrent GGUF
  exports over the patcher env lock, a user-set
  UNSLOTH_LLAMA_CPP_SCRIPTS_DIR override, and monolith vs package
  converter placement.
- tests/mlx_simulation/mlx_stub.py: save_safetensors now normalizes tensors
  to contiguous; real MLX has no contiguity constraint, so the shim should
  not reject the rewrite loop's transposed views.

All 273 MLX tests pass on Linux via the simulation shim across Python
3.10-3.13 and transformers 4.51.3, 4.57.6 and 5.5.0. The only failure is
the pre-existing FakeModel one in test_mlx_finetune_last_n_layers.py,
tracked in #700.

* Tighten comments and docstrings for PR #697

Comment and docstring only pass over the PR diff, applying the repo brevity
policy: collapse multi line blocks, drop comments that restate the code, and
keep load bearing rationale. Scope is restricted to lines this PR adds so the
pass cannot conflict with the comment tightening already merged on main.

- unsloth_zoo/llama_cpp.py: 3 line placement comment down to 2.
- unsloth_zoo/mlx/utils.py: file backed safetensors gotcha down to 2 lines,
  dropped a comment restating _copy_source_sidecars.
- unsloth_zoo/mlx/loader.py: _repair_degraded_vlm_processor docstring 8 to 5
  lines keeping the why.
- tests/test_mlx_save_export_regressions.py: module docstring 5 to 3 lines.
- tests/test_mlx_save_export_edge_cases.py: 14 three line section banners
  collapsed to one line each, two line docstrings to one, removed docstrings
  and comments that restate test names or adjacent assertions.

Net 45 lines removed. Every file verified code identical to the previous
commit by AST comparison with docstrings stripped; comment_tools.py check
passes for all files except the one intentional docstring deletion, which
the AST comparison covers. Full MLX suite: 273 passed, 1 skipped, plus the
known pre existing failure tracked in #700.

* Address review feedback for PR #697

All four review comments were valid; each is fixed with a regression test
in tests/test_mlx_save_export_edge_cases.py:

- push_to_hub_gguf: first_conversion moved after token and private so
  pre-existing positional callers keep their meaning; it was inserted
  before token and would have bound a positionally passed HF token as the
  intermediate GGUF dtype.
- _get_mlx_vlm_model_sanitize_pipelines: wrappers without a top level
  sanitize no longer return zero pipelines; submodule sanitizers such as
  vision_tower.sanitize now produce replay pipelines on their own.
- _resolve_mlx_vlm_processor_class: probes the MODEL_REMAPPING target
  package as well, so aliased model types resolve their custom processor
  classes.
- _build_vlm_image_processor_from_config: falls back to the mlx-vlm
  processing module for image processor classes that do not exist in
  transformers, before the AutoImageProcessor retry.

Full MLX suite: 278 passed, 1 skipped. Full repo suite: 1114 passed; the
two failures are pre-existing on the branch (FakeModel stub tracked in
#700, and the PyPI version sync test which lags because this branch
predates the latest release and resolves on merge).

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
…all-and-finetune-fakemodel

# Conflicts:
#	tests/test_mlx_finetune_last_n_layers.py
@danielhanchen danielhanchen merged commit d676285 into main Jun 10, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants