Skip to content

Improve consistency in recurrent layers#312

Merged
davidkoski merged 1 commit intoml-explore:mainfrom
DePasqualeOrg:fix-recurrent-bias-parameters
Dec 16, 2025
Merged

Improve consistency in recurrent layers#312
davidkoski merged 1 commit intoml-explore:mainfrom
DePasqualeOrg:fix-recurrent-bias-parameters

Conversation

@DePasqualeOrg
Copy link
Contributor

@DePasqualeOrg DePasqualeOrg commented Dec 7, 2025

Proposed changes

Bias parameters in RNN, GRU, and LSTM were not loadable from saved weights because they lacked the @ParameterInfo attribute.

Changes:

- RNN: Add @ParameterInfo to bias, add hiddenSize property
- GRU: Add @ParameterInfo to b and bhn
- LSTM: Add @ParameterInfo to bias, add hiddenSize property

This aligns with the Python API, where bias parameters are automatically registered as loadable module parameters.

Edit: Corrected below

Checklist

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

@DePasqualeOrg DePasqualeOrg marked this pull request as ready for review December 7, 2025 17:14
@davidkoski
Copy link
Collaborator

If the @PropertyInfo is omitted both MLXArray and MLXArray? should still appear and be updatable as properties using the actual property name.

See ModuleTests, in particular ChildTestModule and TestModule. Also testTupleParameters is a more direct test of this (though more a more convoluted case).

You mentioned:

Bias parameters in RNN, GRU, and LSTM were not loadable from saved weights

I am not opposed to adding the property wrappers, but I wonder what is going on such that they wouldn't load -- can you give repro steps?

@DePasqualeOrg DePasqualeOrg force-pushed the fix-recurrent-bias-parameters branch from e0d9a61 to 35451bd Compare December 9, 2025 08:07
@DePasqualeOrg
Copy link
Contributor Author

Sorry, that was a misunderstanding on my part. @ParameterInfo doesn't need to be added where keys don't need to be remapped. The remaining changes are just for consistency: RNN's bias now uses let like other layers, and hiddenSize is exposed on all three classes, mirroring the Python API.

@DePasqualeOrg DePasqualeOrg changed the title Add @ParameterInfo to recurrent layer bias parameters Improve consistency in recurrent layers Dec 9, 2025
Copy link
Collaborator

@davidkoski davidkoski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you!

@davidkoski davidkoski merged commit b36a55c into ml-explore:main Dec 16, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants