@yaoyaoding yaoyaoding commented Sep 21, 2025

This PR adds the following tcgen05 instructions:

  • copy
  • commit

Signed-off-by: Yaoyao Ding <dingyaoyao.cs@gmail.com>
@yaoyaoding yaoyaoding requested a review from Copilot September 21, 2025 03:00

Copilot AI left a comment

Pull Request Overview

This PR adds the tcgen05 copy instruction to the Tilus framework, enabling copy operations between shared memory and tensor memory (TMEM) on NVIDIA GPUs. The implementation includes comprehensive layout handling, instruction emitters, and test coverage for various swizzle modes.

  • Adds tcgen05 copy instruction with comprehensive TMEM support
  • Implements shared memory layout canonicalization for tcgen05 operations
  • Reorganizes layout operations into a structured module hierarchy

Reviewed Changes

Copilot reviewed 53 out of 58 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/ir/layout/test_tcgen05_smem.py | Test cases for tcgen05 shared memory layouts with various swizzle modes |
| tests/instructions/test_tcgen05_copy.py | Integration tests for tcgen05 copy instruction functionality |
| python/tilus/ir/layout/cuda/tcgen05_smem.py | Core implementation of tcgen05 shared memory layout canonicalization |
| python/tilus/backends/emitters/cuda/tcgen05/copy.py | Code emitter for tcgen05 copy instructions |
| python/tilus/ir/instructions/cuda/tmem.py | Instruction definitions for tcgen05 copy and commit operations |
| python/tilus/extensions/hidet/ir/primitives/cuda/tcgen05.py | Low-level tcgen05 primitive functions and encoding |

@yaoyaoding yaoyaoding requested a review from Copilot September 21, 2025 03:04

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 53 out of 58 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

python/tilus/ir/layout/cuda/tcgen05_smem.py:1

  • The calculation of rhs_indices is incorrect. It should use axes[i] % rhs.shape[i], not axes[i] // rhs.shape[i]. As written, it duplicates the lhs_indices calculation.
from __future__ import annotations
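
The index split this comment refers to can be sketched as follows. This is a minimal illustration, not the Tilus implementation: it assumes a composed layout in which the rhs layout tiles the inner part of each axis, so the quotient selects the lhs position and the remainder selects the rhs position. The helpers `split_indices` and `merge_indices` are hypothetical names introduced only for this example.

```python
from typing import List, Sequence, Tuple


def split_indices(
    axes: Sequence[int], rhs_shape: Sequence[int]
) -> Tuple[List[int], List[int]]:
    """Split each composite index into an outer (lhs) and an inner (rhs) part.

    Hypothetical helper: assumes axis i of the composed layout has extent
    lhs_shape[i] * rhs_shape[i], with the rhs part varying fastest.
    """
    # Quotient: which rhs-sized tile the index falls into.
    lhs_indices = [a // r for a, r in zip(axes, rhs_shape)]
    # Remainder: the offset inside that tile.
    rhs_indices = [a % r for a, r in zip(axes, rhs_shape)]
    return lhs_indices, rhs_indices


def merge_indices(
    lhs_indices: Sequence[int],
    rhs_indices: Sequence[int],
    rhs_shape: Sequence[int],
) -> List[int]:
    """Inverse of split_indices: rebuild the composite index per axis."""
    return [l * r + i for l, i, r in zip(lhs_indices, rhs_indices, rhs_shape)]


lhs_idx, rhs_idx = split_indices([6, 19], [4, 8])
print(lhs_idx, rhs_idx)  # [1, 2] [2, 3]
print(merge_indices(lhs_idx, rhs_idx, [4, 8]))  # [6, 19]
```

Using `//` for both parts would lose the within-tile offset, so the round trip through `merge_indices` would no longer recover the original indices.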

@yaoyaoding yaoyaoding mentioned this pull request Sep 20, 2025
17 tasks
@yaoyaoding yaoyaoding changed the title from "[Instruction][tcgen05] Add copy instruction" to "[Instruction][tcgen05] Add copy and commit instruction" Sep 21, 2025
@yaoyaoding yaoyaoding merged commit 1480c94 into main Sep 21, 2025
8 checks passed