Optimize w8a8 kernel vmem limit #9508

Merged
vanbasten23 merged 1 commit into pytorch:master from kyuyeunk:optimize_w8a8_vmem_limit
Jul 26, 2025

Conversation

@kyuyeunk
Contributor

Use tighter lower bound for vmem limit to allow better pipelining.
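
For readers unfamiliar with how a VMEM limit relates to tile sizes, here is a minimal sketch of a byte-count estimate for one w8a8 matmul tile configuration. It is not taken from this PR's diff: the function name, the parameters, and the buffer accounting (int8 activation and weight tiles, a float32 per-channel weight scale, an int32 accumulator, double buffering) are all assumptions made for illustration.

```python
def estimate_vmem_limit_bytes(
    batch_block: int,
    out_block: int,
    in_block: int,
    out_dtype_bytes: int = 2,  # assumption: 2-byte output, e.g. bfloat16
    acc_dtype_bytes: int = 4,  # assumption: int32 accumulator scratch
    num_buffers: int = 2,      # assumption: double buffering for pipelining
) -> int:
    """Hypothetical estimate of the VMEM bytes one tile configuration needs."""
    # int8 activation and weight tiles take 1 byte per element.
    x_tile = batch_block * in_block
    w_tile = out_block * in_block
    # Assumed float32 per-output-channel weight scale.
    w_scale_tile = out_block * 4
    out_tile = batch_block * out_block * out_dtype_bytes
    # Tiles that are streamed in or out are held once per pipeline buffer.
    pipelined = num_buffers * (x_tile + w_tile + w_scale_tile + out_tile)
    # The accumulator stays resident and is not multi-buffered.
    accumulator = batch_block * out_block * acc_dtype_bytes
    return pipelined + accumulator


# Example: a 128 x 256 output tile with a 512-wide reduction block.
limit = estimate_vmem_limit_bytes(batch_block=128, out_block=256, in_block=512)
print(limit)
```

An estimate of this kind gives the smallest limit that still fits every buffer of the chosen tiling. The actual kernel may count additional buffers or add headroom.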

Comment thread torch_xla/experimental/pallas_kernels/quantized_matmul_kernel.py Outdated
Comment thread torch_xla/experimental/pallas_kernels/quantized_matmul_kernel.py Outdated
Comment thread torch_xla/experimental/pallas_kernels/quantized_matmul_kernel.py
@kyuyeunk force-pushed the optimize_w8a8_vmem_limit branch from c775f23 to 77a8a75 on July 25, 2025 00:37
@kyuyeunk changed the title from "Optimize w8a8 kernel vmem limit." to "Optimize w8a8 kernel vmem limit" on Jul 25, 2025
@kyuyeunk force-pushed the optimize_w8a8_vmem_limit branch 2 times, most recently from 3db386d to c87d78d, on July 25, 2025 01:49
Comment thread torch_xla/experimental/pallas_kernels/quantized_matmul_kernel.py Outdated
Comment thread torch_xla/experimental/pallas_kernels/quantized_matmul_kernel.py Outdated
Comment thread torch_xla/experimental/pallas_kernels/quantized_matmul_kernel.py Outdated
@kyuyeunk force-pushed the optimize_w8a8_vmem_limit branch from 924c68a to de47e49 on July 26, 2025 00:13
@vanbasten23 enabled auto-merge (squash) on July 26, 2025 05:16
Collaborator

@vanbasten23 left a comment

LGTM. Thanks Kyuyeun.

@vanbasten23 merged commit 29ae4c7 into pytorch:master on Jul 26, 2025
23 of 24 checks passed
@kyuyeunk deleted the optimize_w8a8_vmem_limit branch on July 26, 2025 06:25