
Commit 32ce0e2

fix: stabilize Gemma4 MoE inference — dynamic attention mask slicing
- Update mlx-swift-lm submodule with safeMask fix for broadcast_shapes crash during sliding window attention on long prompts (1370+ tokens)
- Add tmp/ to .gitignore

HomeSec-Bench: 87/96 (90%), zero server crashes across full 96-test suite
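The submodule change itself is not part of this diff, so the sketch below only illustrates the general idea behind slicing an attention mask so that its key axis matches the attention scores. It is written in Python for brevity, not in the submodule's Swift, and it makes several assumptions: the helper names `slice_mask_to_keys` and `apply_mask` are hypothetical, the cache is assumed to retain only the newest keys, and the shape check is stricter than real broadcasting rules.

```python
def shape(matrix):
    """Return (rows, cols) of a rectangular list-of-lists."""
    return (len(matrix), len(matrix[0]) if matrix else 0)


def slice_mask_to_keys(mask, key_len):
    """Hypothetical helper: keep only the newest key_len columns of the mask.

    Assumes the cache retains the most recent keys, so the trailing
    columns of the mask are the ones that still have a matching key.
    """
    if shape(mask)[1] <= key_len:
        return mask  # already compatible, nothing to trim
    return [row[-key_len:] for row in mask]


def apply_mask(scores, mask):
    """Add an additive mask to scores; raise if the shapes disagree."""
    if shape(scores) != shape(mask):
        raise ValueError(f"cannot combine {shape(scores)} with {shape(mask)}")
    return [[s + m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(scores, mask)]


NEG_INF = float("-inf")

# Toy sizes (assumed): 2 queries, a mask built for 6 keys,
# but only 4 keys are actually present in the cache.
scores = [[0.0] * 4 for _ in range(2)]
mask = [[0.0, 0.0, 0.0, 0.0, 0.0, NEG_INF],
        [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]

# Slicing first makes the shapes agree; without it apply_mask raises.
masked = apply_mask(scores, slice_mask_to_keys(mask, key_len=4))
```

Passing the unsliced `mask` to `apply_mask` raises a `ValueError`, which stands in for the shape mismatch described in the commit message. Whether the real fix trims the trailing or leading columns is not stated in this commit.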
1 parent c59f6a1 commit 32ce0e2

2 files changed

Lines changed: 2 additions & 1 deletion


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -20,3 +20,4 @@ DerivedData/
 *.pid
 curl_out.txt
 sample.txt
+tmp/
