Add MiniMax and MiMo v2 Flash models by ronaldmannak · Pull Request #50 · ml-explore/mlx-swift-lm

ronaldmannak · 2026-01-09T04:26:12Z

Proposed changes

This PR includes two ports: MiniMax and MiMo v2 Flash. As with the previous PR, I don’t have enough memory available to run a full verification of the models. However, I reviewed the implementation and checked it against the reference Python code.

Checklist

I have read the CONTRIBUTING document
I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
I have added tests that prove my fix is effective or that my feature works
I have updated the necessary documentation (if needed)

ronaldmannak · 2026-01-15T02:08:15Z

@davidkoski This PR builds without issues using MLX 0.30.2 and is ready for review and for testing on a device with more memory than I have.

davidkoski · 2026-01-15T21:32:42Z

Looks like swift-format wants some changes.

davidkoski · 2026-01-15T21:33:07Z

Here is a MiniMax run:

./mlx-run llm-tool eval --download ~/Downloads/huggingface --model mlx-community/MiniMax-M2.1-8bit --prompt "Tell me a story about cheese." --memory-stats --cache-size 40
--- xcodebuild: WARNING: Using the first of multiple matching destinations:
{ platform:macOS, arch:arm64, id:00006032-000809890145801C, name:My Mac }
{ platform:macOS, name:Any Mac }
Loading mlx-community/MiniMax-M2.1-8bit...
Loaded mlx-community/MiniMax-M2.1-8bit
Starting generation ...

Tell me a story about cheese. The user is asking for a story about cheese. This is a creative writing request. I need to:

1. Identify the writing type: Creative story/fiction
2. Determine appropriate style: Engaging narrative, imaginative, with some humor or charm
3. Consider the audience: General reader looking for entertainment
4. Plan structure: A traditional story arc with beginning, middle, and end

I'll write a charming, imaginative story about cheese that incorporates interesting facts and history while being entertaining. I'll aim------
Prompt:     47 tokens, 3.800743 tokens/s, 12.366004s
Generation: 100 tokens, 0.170617 tokens/s, 586.108668s
=======
Memory size: 510027366K
Cache size:  40960K

=======
Starting memory
Peak:   231730M      (242986802992)
Active: 231730M      (242986802992)
Cache:  1K           (1244)

=======
Ending memory
Peak:   231844M      (243106687920)
Active: 231794M      (243054389988)
Cache:  36M          (38120584)

=======
Growth
Peak:   114M         (119884928)
Active: 64M          (67586996)
Cache:  36M          (38119340)

MiMo is still downloading :-)
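
The figures in the MiniMax log above are internally consistent: each rate is the token count divided by the elapsed seconds, and each growth value is the ending byte count minus the starting one. A minimal sketch that recomputes them from values copied out of the log; reading the `M` suffix as mebibytes (truncated) is an assumption, not something the tool output states:

```swift
// Values copied from the MiniMax log above.
let promptRate = 47.0 / 12.366004           // prompt tokens per second
let generationRate = 100.0 / 586.108668     // generated tokens per second

let startingPeak = 242_986_802_992          // bytes, "Starting memory" peak
let endingPeak = 243_106_687_920            // bytes, "Ending memory" peak
let peakGrowth = endingPeak - startingPeak  // bytes, "Growth" peak

// Assumption: "M" means mebibytes (1,048,576 bytes), truncated to a whole number.
let bytesPerMiB = 1_048_576

print("prompt tokens/s:", promptRate)
print("generation tokens/s:", generationRate)
print("peak growth:", peakGrowth, "bytes,", peakGrowth / bytesPerMiB, "M")
```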

davidkoski · 2026-01-15T22:13:48Z

MiMo fails like this:

 ./mlx-run llm-tool eval --download ~/Downloads/huggingface --model mlx-community/MiMo-V2-Flash-mlx-8bit --prompt "Tell me a story about cheese." --memory-stats --cache-size 40
--- xcodebuild: WARNING: Using the first of multiple matching destinations:
{ platform:macOS, arch:arm64, id:00006032-000809890145801C, name:My Mac }
{ platform:macOS, name:Any Mac }
Loading mlx-community/MiMo-V2-Flash-mlx-8bit...
Error: Key model.layers.0.self_attn.attention_sink_bias not found in MiMoV2FlashModel.MiMoV2FlashModelInner.MiMoV2FlashDecoderLayer.MiMoV2FlashAttention

It looks like this part from the Python code wasn't ported:

        if self.has_sinks:
            self.attention_sink_bias = mx.ones((self.n_heads,))
        else:
            self.attention_sink_bias = None

though self.has_sinks will be True based on the config:

    "add_full_attention_sink_bias": false,
    "add_swa_attention_sink_bias": true,

I think this may be a case where the Python code keeps the default ones and the parameter load doesn't validate that the key is present.

We don't have a great way to handle this in mlx-swift. If the parameter isn't named with a leading _ and it has a value then per the validation on the load it has to be present.

Some ways we might do it:

override update(parameters:) and handle this case, either by doing the work directly or by calling super and modifying the VerifyUpdate
in sanitize(weights:), detect that it is missing and add ones
on Module, add a sort of delegate call to handle missing values, which subclasses could override to do something different

Assuming this is correct behavior, this is the first time it has come up.
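
For illustration, the second option (filling in the missing value while sanitizing the weights) could look roughly like the sketch below. It is a self-contained toy, not the mlx-swift-lm implementation: the `Weights` alias uses plain `[Float]` arrays in place of real tensors, and `addMissingSinkBias`, `sinkLayers`, and `headCount` are hypothetical names. Only the key format is taken from the error message above.

```swift
// Hypothetical stand-in: real weights would map names to tensors, not [Float].
typealias Weights = [String: [Float]]

/// Returns a copy of `weights` in which every layer listed in `sinkLayers`
/// has an `attention_sink_bias` entry, defaulting to all ones when absent.
func addMissingSinkBias(_ weights: Weights, sinkLayers: [Int], headCount: Int) -> Weights {
    var result = weights
    for layer in sinkLayers {
        let key = "model.layers.\(layer).self_attn.attention_sink_bias"
        if result[key] == nil {
            // Mirrors the Python default: ones with one entry per head.
            result[key] = [Float](repeating: 1, count: headCount)
        }
    }
    return result
}

// Layer 1 ships a bias in the (toy) weight file, layer 0 does not.
let loaded: Weights = ["model.layers.1.self_attn.attention_sink_bias": [0.5, 0.25]]
let sanitized = addMissingSinkBias(loaded, sinkLayers: [0, 1], headCount: 2)
print(sanitized["model.layers.0.self_attn.attention_sink_bias"]!)
print(sanitized["model.layers.1.self_attn.attention_sink_bias"]!)
```

Values that are present in the file are left untouched, so the default only applies where the key is genuinely missing.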

…ssing parameters - if validation includes .allModelKeysSet then a missing non-optional parameter will throw a validation error - modules have no clean way to fine tune this, e.g. if one particular parameter is ok to be missing - would allow a fix for ml-explore/mlx-swift-lm#50

ronaldmannak · 2026-01-15T23:42:39Z

> Looks like swift-format wants some changes.

This I can fix :)

ronaldmannak · 2026-01-16T00:00:27Z

@davidkoski So MiniMax runs fine, but MiMo fails. In hindsight I should've created two separate PRs.

If I understand you correctly, once #336 is merged, we could override updateMissing in MiMoV2Flash like so:

    override func updateMissing(
        parameter: String,
        verify: VerifyUpdate,
        path: [String],
        modulePath: [String]
    ) throws {
        if parameter == "attention_sink_bias", hasSinks {
            // Weight file omits it; keep the default `ones` already constructed.
            return
        } else {
            throw UpdateError.keyNotFound(path: path, modules: modulePath)
        }
    }

Is that correct?

davidkoski · 2026-01-16T00:14:06Z

> @davidkoski So MiniMax runs fine, but MiMo fails. In hindsight I should've created two separate PRs.

I don't know how many times I have said that myself :-)

> Is that correct?

Yes, exactly.

ronaldmannak · 2026-01-16T00:34:53Z

> I don't know how many times I have said that myself :-)

Sorry about that. I was lazy and overconfident :)

I'll update the code.

ronaldmannak · 2026-01-16T01:27:51Z

@davidkoski Override added. I'm calling super instead of throwing the error from within the override, in case the super implementation or the error changes in the future.
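
The difference between throwing in the override and deferring to super can be shown with a small self-contained sketch. The types here (`ToyModule`, `ToyAttention`, `ToyUpdateError`) are hypothetical stand-ins with a simplified one-argument hook; they are not the mlx-swift `Module` API, whose hook takes the additional arguments shown in the earlier comment.

```swift
// Hypothetical stand-ins for illustration only; not the mlx-swift types.
enum ToyUpdateError: Error {
    case keyNotFound(String)
}

class ToyModule {
    /// Base policy: any missing parameter is an error.
    func updateMissing(parameter: String) throws {
        throw ToyUpdateError.keyNotFound(parameter)
    }
}

final class ToyAttention: ToyModule {
    let hasSinks: Bool

    init(hasSinks: Bool) {
        self.hasSinks = hasSinks
        super.init()
    }

    override func updateMissing(parameter: String) throws {
        if parameter == "attention_sink_bias", hasSinks {
            return  // the weight file omits it; keep the default already constructed
        }
        // Defer to the base class so any future change to its error or
        // behavior is picked up without touching this override.
        try super.updateMissing(parameter: parameter)
    }
}

/// Reports whether the module tolerates the given parameter being absent.
func isAccepted(_ module: ToyModule, _ parameter: String) -> Bool {
    do {
        try module.updateMissing(parameter: parameter)
        return true
    } catch {
        return false
    }
}

let attention = ToyAttention(hasSinks: true)
print(isAccepted(attention, "attention_sink_bias"))
print(isAccepted(attention, "q_proj.weight"))
```

Only the one known-optional parameter is accepted; every other missing key still goes through the base-class path.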

…ssing parameters (#336) * Add hook for Modules that need finer grained control on validating missing parameters - if validation includes .allModelKeysSet then a missing non-optional parameter will throw a validation error - modules have no clean way to fine tune this, e.g. if one particular parameter is ok to be missing - would allow a fix for ml-explore/mlx-swift-lm#50 * add CI workaround

davidkoski · 2026-01-22T20:57:30Z

OK, the corresponding change in mlx-swift is picked up; I think this just needs a rebase.

ronaldmannak · 2026-02-16T21:06:27Z

@davidkoski fixed

davidkoski

Looks great, thank you!

ronaldmannak added 3 commits January 8, 2026 20:21

Add MiniMax and MiMo v2 Flash models

c3e5f63

Update readme

b71c883

Merge branch 'main' into minimaxmimov2flash

75c83c7

davidkoski mentioned this pull request Jan 15, 2026

Add hook for Modules that need finer grained control on validating missing parameters ml-explore/mlx-swift#336

Merged

4 tasks

fix swift-lint

6bad556

Add updateMissing

989f9d5

davidkoski added the swift-format Swift format failure in CI label Feb 16, 2026

swift lint

39b8f10

davidkoski approved these changes Feb 16, 2026

davidkoski merged commit 1b876cf into ml-explore:main Feb 16, 2026
2 checks passed

davidkoski merged 6 commits into ml-explore:main from PicoMLX:minimaxmimov2flash
