Cherry pick four attention PRs #7315

Merged
justinchuby merged 4 commits into rel-1.19.1 from justinchu/pick-attention on Sep 22, 2025

Conversation

titaiwangms and others added 4 commits September 22, 2025 12:04
### Description

Replace `tile` with `repeat_interleave` when repeating KV heads in attention.

### Motivation and Context

In GQA, each KV head has to be matched to multiple query heads. Prior to this PR we used `tile`, which produces the order [head0, head1, head0, head1]; torch instead uses `repeat_interleave`, which produces [head0, head0, head1, head1].


https://github.com/pytorch/pytorch/blob/734ce8eba9c69381f187359bf0fef1d71d84cd20/torch/nn/functional.py#L5833-L5835
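
A minimal sketch of the difference (plain PyTorch, not the exporter code; the shapes below are made up for illustration):

```python
import torch

# Two KV heads (rows), each to be repeated twice to serve four query heads.
kv = torch.tensor([[0.0], [1.0]])  # row 0 = head0, row 1 = head1
n_rep = 2

tiled = kv.tile((n_rep, 1))                       # order: head0, head1, head0, head1
interleaved = kv.repeat_interleave(n_rep, dim=0)  # order: head0, head0, head1, head1
```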

---------

Signed-off-by: Ti-Tai Wang <titaiwang@microsoft.com>
… following opset 24 specs) (#7282)

### Description
Force the opset to be 23 in all backend tests related to Attention, except one that follows the opset 24 definition.

### Motivation and Context
A runtime that only implements opset 23 cannot be tested otherwise (without tweaking).

---------

Signed-off-by: xadupre <xadupre@microsoft.com>
### Description
Fix causal mask for attention.

Another, unrelated change was needed to achieve that goal: when the automated backend test produces the expanded version of a test (meaning the operator can be replaced by a function), it did so without considering the opset, which could lead to a wrong model. Now the expanded test takes the function definition from the same opset as the one used in the test.

### Motivation and Context
The past key/value inputs should not be impacted by the causal mask.
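
A minimal sketch of the intended masking (NumPy, with made-up sizes; not the reference implementation itself): the causal constraint only applies among the new query/key positions, while all past positions stay visible.

```python
import numpy as np

q_len, past_len = 3, 2            # new queries, cached past keys/values
total_kv = past_len + q_len       # keys each query can attend to

# blocked[i, j] is True where query i must NOT attend to key j.
blocked = np.zeros((q_len, total_kv), dtype=bool)
for i in range(q_len):
    blocked[i, past_len + i + 1:] = True  # only future *new* positions are masked

# All past columns (j < past_len) remain unmasked for every query.
assert not blocked[:, :past_len].any()
```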

---------

Signed-off-by: xadupre <xadupre@microsoft.com>
Signed-off-by: Xavier Dupré <xadupre@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
### Description

### Motivation and Context

Signed-off-by: xadupre <xadupre@microsoft.com>
codecov Bot commented Sep 22, 2025

Codecov Report

❌ Patch coverage is 87.09677% with 4 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (rel-1.19.1@cb20f6f). Learn more about missing BASE report.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| onnx/backend/test/case/node/__init__.py | 33.33% | 2 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@              Coverage Diff              @@
##             rel-1.19.1    #7315   +/-   ##
=============================================
  Coverage              ?   53.77%           
=============================================
  Files                 ?      512           
  Lines                 ?    32214           
  Branches              ?     2946           
=============================================
  Hits                  ?    17322           
  Misses                ?    14121           
  Partials              ?      771           

☔ View full report in Codecov by Sentry.

github-project-automation bot moved this from In progress to Reviewer approved in PR Tracker on Sep 22, 2025
justinchuby merged commit fb51738 into rel-1.19.1 on Sep 22, 2025
34 of 36 checks passed
justinchuby deleted the justinchu/pick-attention branch on September 22, 2025 19:21
github-project-automation bot moved this from Reviewer approved to Done in PR Tracker on Sep 22, 2025
yuanyao-nv mentioned this pull request on Oct 9, 2025