Fix race condition in clearCache causing Metal crash #331

Merged
davidkoski merged 1 commit into ml-explore:main from aleroot:main
Jan 10, 2026

@aleroot (Contributor) commented Jan 10, 2026

Proposed changes

This PR fixes a crash that occurs when clearCache() is called while an asynchronous evaluation is still encoding commands.

MLX.GPU.clearCache(), which calls Memory.clearCache(), was not synchronized with the evaluation engine. If it was called during an active generation loop (for example, while the app was quitting), it would invalidate Metal resources that the command encoder was still using, leading to an objc_retain crash in CaptureMTLComputeCommandEncoder.
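
To illustrate the kind of synchronization described above, here is a minimal sketch. It is not the code from this PR. It assumes a single lock shared by the evaluation path and the cache-clearing path, and every name in it (`GuardedCache`, `evaluate(steps:)`, `sawOverlap()`, `runDemo()`) is hypothetical and not part of MLX.

```swift
import Foundation

// Hypothetical stand-in for an evaluation engine plus its resource cache.
// None of these names are MLX APIs; they only illustrate the locking pattern.
final class GuardedCache: @unchecked Sendable {
    private let lock = NSLock()       // shared by evaluation and cache clearing
    private var buffers: [Int] = []   // stand-in for cached Metal resources
    private var encoding = false      // true while an evaluation uses the cache
    private var overlap = false       // set if a clear ever lands mid-encode

    /// Simulates encoding commands that rely on cached resources.
    func evaluate(steps: Int) {
        lock.lock()
        defer { lock.unlock() }
        encoding = true
        for step in 0..<steps { buffers.append(step) }
        encoding = false
    }

    /// Releases cached resources, waiting for any in-flight evaluation first.
    func clearCache() {
        lock.lock()
        defer { lock.unlock() }
        if encoding { overlap = true }  // unreachable while both paths share the lock
        buffers.removeAll()
    }

    /// Reports whether a clear was ever observed during encoding.
    func sawOverlap() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        return overlap
    }
}

/// Runs evaluations and cache clears concurrently and reports any overlap.
func runDemo() -> Bool {
    let cache = GuardedCache()
    let group = DispatchGroup()
    let queue = DispatchQueue(label: "demo.concurrent", attributes: .concurrent)
    for _ in 0..<100 {
        queue.async(group: group) { cache.evaluate(steps: 1_000) }
        queue.async(group: group) { cache.clearCache() }
    }
    group.wait()
    return cache.sawOverlap()
}
```

Because both methods take the same lock, `clearCache()` in this sketch can only run between evaluations, so `runDemo()` returns `false`. Removing the lock from `clearCache()` would reintroduce the window in which resources are released while still in use.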

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • [] I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

@davidkoski (Collaborator) left a comment

Nice find, thank you!

@davidkoski merged commit 5bd59d0 into ml-explore:main on Jan 10, 2026
7 checks passed