[Helion + torch.compile] Add store/load transform hooks and prologue/epilogue fusion codegen#1724

Merged
yf225 merged 1 commit into main from yf225/stack/71
Mar 23, 2026
@yf225 yf225 commented Mar 16, 2026

Stacked PRs:


[Helion + torch.compile] Add store/load transform hooks and prologue/epilogue fusion codegen

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 16, 2026
@yf225 yf225 marked this pull request as draft March 16, 2026 22:29
@yf225 yf225 changed the base branch from yf225/stack/70 to main March 16, 2026 22:29
@yf225 yf225 changed the base branch from main to yf225/stack/70 March 16, 2026 22:30
@yf225 yf225 marked this pull request as ready for review March 16, 2026 22:30
@yf225 yf225 marked this pull request as draft March 16, 2026 22:36
@yf225 yf225 changed the base branch from yf225/stack/70 to main March 16, 2026 22:36
@yf225 yf225 changed the base branch from main to yf225/stack/70 March 16, 2026 22:36
@yf225 yf225 marked this pull request as ready for review March 16, 2026 22:36
@yf225 yf225 marked this pull request as draft March 16, 2026 22:43
@yf225 yf225 changed the base branch from yf225/stack/70 to main March 16, 2026 22:43
@yf225 yf225 changed the base branch from main to yf225/stack/70 March 16, 2026 22:44
@yf225 yf225 marked this pull request as ready for review March 16, 2026 22:44
yf225 added a commit that referenced this pull request Mar 16, 2026
…rameter plumbing

Add infrastructure for prologue/epilogue fusion in generated Triton code:
- store_transform/load_transform hooks in memory_ops for hl.store/hl.load
- Extra params, removed args, and protected arg names in codegen pipeline
- dim_index_exprs on SubscriptIndexing for epilogue fusion offsets
- _ensure_inductor_fusion_config() to enable fusion via config_patches

stack-info: PR: #1724, branch: yf225/stack/71
@yf225 yf225 marked this pull request as draft March 16, 2026 23:46
@yf225 yf225 changed the base branch from yf225/stack/70 to main March 16, 2026 23:46
@yf225 yf225 changed the base branch from main to yf225/stack/70 March 16, 2026 23:46
@yf225 yf225 marked this pull request as ready for review March 16, 2026 23:46
yf225 added a commit that referenced this pull request Mar 16, 2026
@yf225 yf225 marked this pull request as draft March 17, 2026 00:00
@yf225 yf225 changed the base branch from yf225/stack/70 to main March 17, 2026 00:00
@yf225 yf225 changed the base branch from main to yf225/stack/70 March 17, 2026 04:56
@yf225 yf225 marked this pull request as ready for review March 17, 2026 04:56
@yf225 yf225 marked this pull request as draft March 17, 2026 04:58
@yf225 yf225 changed the base branch from yf225/stack/70 to main March 17, 2026 04:58
@yf225 yf225 changed the base branch from main to yf225/stack/70 March 17, 2026 04:58
@yf225 yf225 marked this pull request as ready for review March 17, 2026 04:58
@yf225 yf225 requested review from jansel, oulgen and shunting314 March 17, 2026 05:20
yf225 added a commit that referenced this pull request Mar 17, 2026
# Conflicts:
#	helion/_compiler/generate_ast.py
@shunting314 shunting314 left a comment

Maybe you did this in a later PR, but at a high level, how do you pass in the load/store transforms and extra params?

@yf225 yf225 marked this pull request as draft March 19, 2026 19:03
@yf225 yf225 changed the base branch from yf225/stack/70 to main March 19, 2026 19:03
@yf225 yf225 marked this pull request as ready for review March 19, 2026 19:03
yf225 added a commit that referenced this pull request Mar 19, 2026
yf225 commented Mar 19, 2026

> maybe you did in a later PR but in high level how to you pass in the load/store transform and extra params?

Yes, I have now merged them into the same PR (this one). The high-level flow is:

  1. HelionTemplateBuffer._render_with_hooks() (template_buffer.py) calls kernel._setup_fusion_hooks() to populate fusion metadata (epilogue indices, prologue variables, source buffers, etc.).
  2. _generate_triton_ast() (also in template_buffer.py) checks whether fusion metadata was populated (e.g., self._epilogue_idx_by_param, self._prologue_fused_params) and conditionally passes self._codegen_epilogue_fusion as store_transform and self._codegen_prologue_fusion as load_transform to generate_ast().
  3. generate_ast() forwards them into GenerateAST.__init__() where they become self.store_transform / self.load_transform.
  4. During codegen, memory_ops.py checks state.codegen.store_transform / state.codegen.load_transform (the hooks added in this PR) and invokes them at the hl.store / hl.load call sites.
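
The flow above can be sketched in miniature as follows. This is only an illustration of the callback pattern, not Helion's actual code: `ToyCodegen`, `emit_load`, `emit_store`, `render`, the two example hooks, and the `(name, value) -> value` hook signature are all hypothetical stand-ins, since the real signatures are not shown in this thread.

```python
from __future__ import annotations

import ast
from typing import Callable, Optional

# Hypothetical hook signature (an assumption): the hook receives the tensor
# name and the value expression, and returns a replacement expression.
Transform = Callable[[str, ast.expr], ast.expr]


def expr(src: str) -> ast.expr:
    """Parse a source snippet into a single expression node."""
    return ast.parse(src, mode="eval").body


class ToyCodegen:
    """Stand-in for GenerateAST: it only stores the two optional hooks."""

    def __init__(
        self,
        store_transform: Optional[Transform] = None,
        load_transform: Optional[Transform] = None,
    ) -> None:
        self.store_transform = store_transform
        self.load_transform = load_transform


def emit_load(codegen: ToyCodegen, name: str) -> ast.expr:
    """Load call site: build the load, then let the prologue hook rewrite it."""
    value = expr(f"tl.load({name}_ptr)")
    if codegen.load_transform is not None:
        value = codegen.load_transform(name, value)
    return value


def emit_store(codegen: ToyCodegen, name: str, value: ast.expr) -> ast.expr:
    """Store call site: let the epilogue hook rewrite the value, then store it."""
    if codegen.store_transform is not None:
        value = codegen.store_transform(name, value)
    return ast.Call(func=expr("tl.store"), args=[expr(f"{name}_ptr"), value], keywords=[])


def relu_epilogue(name: str, value: ast.expr) -> ast.expr:
    """Example epilogue: clamp the stored value at zero."""
    return ast.Call(func=expr("tl.maximum"), args=[value, expr("0.0")], keywords=[])


def scale_prologue(name: str, value: ast.expr) -> ast.expr:
    """Example prologue: scale the loaded value by 2."""
    return ast.BinOp(left=value, op=ast.Mult(), right=expr("2.0"))


def render(fused_epilogue: bool, fused_prologue: bool) -> str:
    """Pass each hook only when the matching fusion was set up (as in step 2)."""
    codegen = ToyCodegen(
        store_transform=relu_epilogue if fused_epilogue else None,
        load_transform=scale_prologue if fused_prologue else None,
    )
    x = emit_load(codegen, "x")
    y = ast.BinOp(left=x, op=ast.Add(), right=expr("1.0"))
    return ast.unparse(emit_store(codegen, "out", y))


print(render(False, False))  # tl.store(out_ptr, tl.load(x_ptr) + 1.0)
print(render(True, True))    # tl.store(out_ptr, tl.maximum(tl.load(x_ptr) * 2.0 + 1.0, 0.0))
```

When no hook is installed, the call sites emit the plain load and store, so the unfused path is unchanged; that is presumably why the hooks are passed conditionally.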

yf225 added a commit that referenced this pull request Mar 19, 2026
@yf225 yf225 marked this pull request as draft March 19, 2026 23:35
@shunting314

@claude review this PR. Try to focus on making sure corner cases are covered and that the integration between the different components is clean.

@yf225 yf225 left a comment

[Claude Review] Overall: a well-structured PR with a clean callback pattern for the store/load transforms. See the inline comments for corner cases and integration concerns.

@yf225 yf225 left a comment

[deleted]

…epilogue fusion codegen

stack-info: PR: #1724, branch: yf225/stack/71
3 participants