[Pallas] Make repeat_with_fixed_output_size not OOM on VMEM #7145

Merged
alanwaketan merged 4 commits into master from alanwaketan/tgmm2
May 30, 2024

Conversation

@alanwaketan
Collaborator

Summary:
ReduceWindow (https://openxla.org/xla/operation_semantics#reducewindow) doesn't support int64, so let's make sure the input to cumsum is always int32.

Test Plan:
python test/test_gmm.py
python test/test_operations.py
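
For illustration, here is a minimal sketch of the "int32 before cumsum" idea from the summary. It uses NumPy as a stand-in, whereas the PR itself targets PyTorch/XLA, so this is not the code from the PR. The names `cumsum_int32` and `group_sizes` are hypothetical, and the sketch assumes the values are non-negative counts.

```python
import numpy as np


def cumsum_int32(repeats):
    """Cumulative sum kept in int32 end to end (hypothetical helper)."""
    repeats = np.asarray(repeats)
    # Guard against silent overflow when narrowing from int64.
    # Assumes non-negative entries, so the total bounds every prefix sum.
    total = int(repeats.sum(dtype=np.int64))
    if total > np.iinfo(np.int32).max:
        raise OverflowError("cumulative sum does not fit in int32")
    # dtype=np.int32 pins the accumulator and result type; without it
    # NumPy may widen an int32 input to the platform integer.
    return np.cumsum(repeats.astype(np.int32), dtype=np.int32)


# Example input that arrives as int64, as index-like data often does.
group_sizes = np.array([2, 0, 3], dtype=np.int64)
offsets = cumsum_int32(group_sizes)
```

The explicit `dtype` argument matters because casting the input alone may not be enough: the scan can still promote its result to a wider integer type.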

@alanwaketan
Collaborator Author

Thanks Jack for approving.

@alanwaketan
Collaborator Author

Let me just remove the cumsum thing...

@alanwaketan
Collaborator Author

Skip GPU tests to move fast.

@alanwaketan merged commit ce1205e into master May 30, 2024
@alanwaketan deleted the alanwaketan/tgmm2 branch May 30, 2024 02:29

2 participants