Add MiniMax and MiMo v2 Flash models #50
|
@davidkoski This PR builds without issues using MLX 0.30.2 and is ready for review and to test on a device with more memory than I have. |
|
Looks like swift-format wants some changes. |
|
Here is a MiniMax run: MiMo is still downloading :-) |
|
MiMo fails like this:
It looks like this part from the Python code wasn't ported:
    if self.has_sinks:
        self.attention_sink_bias = mx.ones((self.n_heads,))
    else:
        self.attention_sink_bias = None
though I think this may be a case where it keeps the default
We don't have a great way to handle this in mlx-swift. If the parameter isn't named with a leading
Some ways we might do it:
Assuming this is correct behavior, this is the first time it has come up. |
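For readers following the thread, here is a minimal sketch of what the unported Python branch amounts to in Swift. It is an illustration only: `SinkBiasSketch` is a hypothetical name, and a plain `[Float]` stands in for the MLX array type so the snippet has no dependencies. It is not the code that landed in the PR.

```swift
// Hypothetical stand-in type; [Float] replaces the MLX array (an assumption for illustration).
struct SinkBiasSketch {
    // nil when the model has no attention sinks, mirroring `None` in the Python code.
    let attentionSinkBias: [Float]?

    init(nHeads: Int, hasSinks: Bool) {
        // Mirrors the Python branch: ones of shape (n_heads,) when sinks are enabled.
        attentionSinkBias = hasSinks ? [Float](repeating: 1, count: nHeads) : nil
    }
}
```

The point of the sketch is that the bias is constructed with a default value, so a weight file that omits it can still leave the module in a usable state.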
…ssing parameters
- if validation includes .allModelKeysSet then a missing non-optional parameter will throw a validation error
- modules have no clean way to fine tune this, e.g. if one particular parameter is ok to be missing
- would allow a fix for ml-explore/mlx-swift-lm#50
This I can fix :) |
|
@davidkoski So MiniMax runs fine, but MiMo fails. In hindsight I should've created two separate PRs. If I understand you correctly, once #336 is merged, we could override updateMissing in MiMoV2Flash like so:
    override func updateMissing(
        parameter: String,
        verify: VerifyUpdate,
        path: [String],
        modulePath: [String]
    ) throws {
        if parameter == "attention_sink_bias", hasSinks {
            // Weight file omits it; keep the default `ones` already constructed.
            return
        } else {
            throw UpdateError.keyNotFound(path: path, modules: modulePath)
        }
    }
Is that correct? |
I don't know how many times I have said that myself :-)
Yes, exactly. |
Sorry about that. I was lazy and overconfident :) I'll update the code |
|
@davidkoski override added. I'm calling super instead of throwing the error from within the override, in case the super implementation or the error changes in the future.
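A rough sketch of the "call super instead of throwing" variant, for comparison with the override proposed earlier in the thread. Everything here is a stand-in so the snippet runs on its own: `SketchModule`, `SketchUpdateError`, and `SketchMiMoAttention` are hypothetical names, and the `verify` argument from the real hook is left out. The actual mlx-swift signature is the one quoted in the thread.

```swift
// Stand-in error type (assumption); the thread refers to UpdateError.keyNotFound.
enum SketchUpdateError: Error, Equatable {
    case keyNotFound(path: [String], modules: [String])
}

// Hypothetical stand-in for the base module class; not the mlx-swift Module.
class SketchModule {
    init() {}

    // Default behavior in this sketch: any missing parameter is an error.
    func updateMissing(parameter: String, path: [String], modulePath: [String]) throws {
        throw SketchUpdateError.keyNotFound(path: path, modules: modulePath)
    }
}

final class SketchMiMoAttention: SketchModule {
    let hasSinks: Bool

    init(hasSinks: Bool) {
        self.hasSinks = hasSinks
        super.init()
    }

    override func updateMissing(parameter: String, path: [String], modulePath: [String]) throws {
        if parameter == "attention_sink_bias", hasSinks {
            // Weight file omits it; keep the default value already constructed.
            return
        }
        // Defer to the base implementation so future changes to its error or logic are inherited.
        try super.updateMissing(parameter: parameter, path: path, modulePath: modulePath)
    }
}
```

Deferring to `super` keeps the subclass responsible only for the one exception it knows about; every other missing key follows whatever the base class does.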
…ssing parameters (#336)
* Add hook for Modules that need finer grained control on validating missing parameters
  - if validation includes .allModelKeysSet then a missing non-optional parameter will throw a validation error
  - modules have no clean way to fine tune this, e.g. if one particular parameter is ok to be missing
  - would allow a fix for ml-explore/mlx-swift-lm#50
* add CI workaround
|
OK, the corresponding change in mlx-swift is picked up. I think this just needs a rebase. |
|
@davidkoski fixed |
davidkoski
left a comment
Looks great, thank you!
Proposed changes
This PR includes two ports: MiniMax and MiMo v2 Flash. As with the previous PR, I don’t have enough memory available to run a full verification of the models. However, I reviewed the implementation and checked it against the reference Python code.
Checklist
pre-commit run --all-files to format my code / installed pre-commit prior to committing changes