
Set scoped vmem for paged attention#8988

Merged
tengyifei merged 6 commits into master from piz/skip_v5e
Apr 17, 2025
Conversation

@zpcore (Member) commented Apr 16, 2025

Fix from @bythew3i to resolve #8987.

@zpcore zpcore marked this pull request as ready for review April 16, 2025 18:03
@bythew3i (Contributor) left a comment

Thanks a lot! LGTM

Comment thread on torch_xla/experimental/pallas_kernels/ragged_paged_attention_kernel.py (outdated)
@zpcore zpcore enabled auto-merge (squash) April 16, 2025 21:02
@zpcore zpcore requested a review from tengyifei April 16, 2025 21:09
@tengyifei tengyifei requested a review from yaochengji April 16, 2025 21:13
@tengyifei (Collaborator) commented

I wonder if this vmem limit should be per TPU generation.

I defer to @yaochengji
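The reviewer's suggestion could be sketched as a lookup keyed on TPU generation. This is a hypothetical sketch, not code from the PR; the byte values below are illustrative placeholders, and `scoped_vmem_limit` is an assumed helper name.

```python
# Hypothetical per-TPU-generation scoped vmem limits, as suggested in
# review. The values are illustrative placeholders, not the limits used
# by the actual ragged_paged_attention kernel.
ILLUSTRATIVE_VMEM_LIMIT_BYTES = {
    "v4": 64 * 1024 * 1024,
    "v5e": 32 * 1024 * 1024,
    "v5p": 64 * 1024 * 1024,
}

# Conservative fallback for generations not listed above.
DEFAULT_VMEM_LIMIT_BYTES = 32 * 1024 * 1024


def scoped_vmem_limit(tpu_generation: str) -> int:
    """Return an illustrative vmem limit for the given TPU generation."""
    return ILLUSTRATIVE_VMEM_LIMIT_BYTES.get(
        tpu_generation, DEFAULT_VMEM_LIMIT_BYTES)
```

A table like this would let the kernel pick a limit matching each generation's VMEM capacity instead of one global constant.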

@zpcore (Member, Author) commented Apr 16, 2025

I just disabled the vmem OOM test for v5e since we will deprecate ragged_paged_attention and use v2 anyway. Will cherry-pick to the r2.7 branch.
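The skip condition described above could look roughly like this. A minimal sketch, assuming the test can query the TPU generation as a string such as "v5e"; `should_skip_vmem_oom_test` is a hypothetical helper name, and the real test suite uses its own accelerator detection.

```python
# Hypothetical sketch of the skip condition: the vmem OOM test is
# disabled on v5e, where the v1 ragged_paged_attention kernel is slated
# for deprecation in favor of v2.
def should_skip_vmem_oom_test(tpu_generation: str) -> bool:
    """Return True when the vmem OOM test should be skipped."""
    return tpu_generation == "v5e"
```

In practice this kind of predicate would feed a decorator like `unittest.skipIf(...)` so the test still runs on other generations.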

@yaochengji (Collaborator) left a comment

LGTM, thanks!

@zpcore zpcore disabled auto-merge April 17, 2025 01:36
@tengyifei tengyifei enabled auto-merge (squash) April 17, 2025 05:40
@tengyifei tengyifei merged commit 366f248 into master Apr 17, 2025
24 checks passed
@zpcore zpcore deleted the piz/skip_v5e branch April 17, 2025 06:06
zpcore added a commit that referenced this pull request Apr 17, 2025


Development

Successfully merging this pull request may close these issues.

Ragged paged attention test fail in v5e

4 participants