Reference representation of dqlinear int4 for xnnpack #2520

Merged
kimishpatel merged 23 commits into main from gh/kimishpatel/7/head on Aug 13, 2025

@kimishpatel kimishpatel commented Jul 10, 2025

Contributor

Stack from ghstack (oldest at bottom):

Summary:
This diff adds an integer-arithmetic reference representation of the
dynamically quantized linear op (int4 weights). It is quite close to how
the arithmetic is done in XNNPACK.

Basic tests are added against the q/dq representation to make sure things
are sane.

Followups:

  • See if such a graph is traceable.
  • Optimize the implementation if needed.

Test Plan:
added

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D78198154
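
To make the summary concrete, here is a minimal pure-Python sketch of what an integer-arithmetic representation can look like next to its q/dq counterpart. It is not the code from this PR. The quantization scheme (asymmetric per-row int8 activations, symmetric groupwise int4 weights) and every function name below are assumptions chosen for illustration.

```python
def quantize_activation(row, qmin=-128, qmax=127):
    """Assumed scheme: asymmetric per-row int8. Returns (q, scale, zero_point)."""
    lo, hi = min(min(row), 0.0), max(max(row), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for an all-zero row
    zero_point = max(qmin, min(qmax, round(qmin - lo / scale)))
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in row]
    return q, scale, zero_point


def quantize_weight_groupwise(row, group_size, qmin=-8, qmax=7):
    """Assumed scheme: symmetric int4 with one scale per group of inputs."""
    q, scales = [], []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        scale = max(abs(v) for v in group) / qmax or 1.0
        scales.append(scale)
        q.extend(max(qmin, min(qmax, round(v / scale))) for v in group)
    return q, scales


def linear_qdq(x_q, x_scale, x_zp, w_q, w_scales, group_size):
    """q/dq form: dequantize both operands to float, then take a float dot product."""
    x = [(v - x_zp) * x_scale for v in x_q]
    w = [v * w_scales[i // group_size] for i, v in enumerate(w_q)]
    return sum(a * b for a, b in zip(x, w))


def linear_integer(x_q, x_scale, x_zp, w_q, w_scales, group_size):
    """Integer form: accumulate integer products per group, rescale once per group."""
    out = 0.0
    for g, w_scale in enumerate(w_scales):
        group = slice(g * group_size, (g + 1) * group_size)
        acc = sum((a - x_zp) * b for a, b in zip(x_q[group], w_q[group]))  # int only
        out += acc * x_scale * w_scale
    return out


# One activation row and one weight row, with two groups of two inputs each.
x = [0.5, -1.25, 2.0, 0.75]
w = [0.1, -0.2, 0.3, 0.05]
x_q, x_scale, x_zp = quantize_activation(x)
w_q, w_scales = quantize_weight_groupwise(w, group_size=2)
y_qdq = linear_qdq(x_q, x_scale, x_zp, w_q, w_scales, group_size=2)
y_int = linear_integer(x_q, x_scale, x_zp, w_q, w_scales, group_size=2)
```

Both paths compute the same quantity up to floating-point rounding, which is the kind of sanity check the tests against q/dq are described as doing.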

When replacing literals with placeholders lists are always converted to tuples

Summary:
This is needed because lists are mutable and therefore not hashable, so
we cannot use literals_to_ph in the pattern rewrites inside
reference_representation_rewrite.py.

Test Plan:
CI + next diff relies on this feature

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
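
The hashability problem described in this commit can be reproduced with plain Python. This is only a sketch: `literal_to_placeholder` is a hypothetical dict standing in for whatever mapping the rewrite code keeps, and `to_hashable` is an illustrative helper, not a function from the PR.

```python
def to_hashable(literal):
    """Recursively convert lists to tuples so the value can be used as a dict key."""
    if isinstance(literal, list):
        return tuple(to_hashable(item) for item in literal)
    return literal


# Hypothetical stand-in for a literal -> placeholder mapping.
literal_to_placeholder = {}

try:
    # A list cannot be a dict key because it is mutable and defines no hash.
    literal_to_placeholder[[1, 32]] = "ph_0"
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# After conversion to a tuple, the same literal works as a key.
literal_to_placeholder[to_hashable([1, 32])] = "ph_0"
```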
Summary:
This is necessary because the matched patterns sometimes contain
literals, such as tuples of ints. These values shouldn't be used for
pattern matching, since they are often based on constants derived from
example inputs.

This is not exactly a safe thing to do in general, so it is turned off
by default.

Test Plan:
Subsequent diff adds a pattern that relies on this

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
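
As an illustration of why ignoring literals can help, here is a toy matcher. It is a sketch only: the `Node` class, the `matches` function and the `ignore_literals` flag as written here are hypothetical and do not reproduce the matcher used in this PR. The flag defaults to off, mirroring the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy graph node: an op name plus arguments (other nodes or literals)."""
    op: str
    args: tuple = ()


def matches(pattern, target, ignore_literals=False):
    """Structural match; literals are compared only when ignore_literals is False."""
    if isinstance(pattern, Node) and isinstance(target, Node):
        return (
            pattern.op == target.op
            and len(pattern.args) == len(target.args)
            and all(
                matches(p, t, ignore_literals)
                for p, t in zip(pattern.args, target.args)
            )
        )
    if isinstance(pattern, Node) or isinstance(target, Node):
        return False  # a node never matches a literal
    return ignore_literals or pattern == target


# The pattern was "traced" with an example input of shape (1, 32),
# so that shape is baked into it as a literal.
pattern = Node("reshape", (Node("x"), (1, 32)))
# The real graph uses a different shape in the same position.
target = Node("reshape", (Node("x"), (4, 32)))

strict_match = matches(pattern, target)
relaxed_match = matches(pattern, target, ignore_literals=True)
```

With the flag off, the baked-in shape blocks the match. With it on, any literal in that position is accepted, which is why the option is not safe in general.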
@pytorch-bot

pytorch-bot Bot commented Jul 10, 2025


🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2520

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ae63634 with merge base fe0ddf1:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kimishpatel added a commit that referenced this pull request Jul 10, 2025
ghstack-source-id: 756a9e9
Pull Request resolved: #2520
@facebook-github-bot facebook-github-bot added the CLA Signed label Jul 10, 2025
Comment on lines +41 to +42
"_qdq_dynamic_quantized_linear_4bit_groupwise",
"_reference_dynamic_quantized_linear_4bit_groupwise",

Contributor

Why do these need to be exposed?

Contributor Author

Oh, good point. It doesn't. AI-assisted coding, I guess. lol

@kimishpatel kimishpatel added the topic: new feature label Jul 11, 2025
kimishpatel added a commit that referenced this pull request Jul 11, 2025
ghstack-source-id: 0f79f1c
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Jul 11, 2025
ghstack-source-id: 080923e
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Jul 11, 2025
ghstack-source-id: deb3efa
Pull Request resolved: #2520

@jerryzh168 jerryzh168 left a comment

Contributor

I'll stamp to unblock, but let me know if any review is needed

@kimishpatel

Contributor Author

@kimishpatel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

kimishpatel added a commit that referenced this pull request Jul 12, 2025
ghstack-source-id: 5108e2c
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 6, 2025
ghstack-source-id: a1e2796
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 7, 2025
ghstack-source-id: f9db619
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 7, 2025
ghstack-source-id: d1a4a2c
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 8, 2025
ghstack-source-id: 6630cc7
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 11, 2025
ghstack-source-id: 0f5643a
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 12, 2025
ghstack-source-id: ddb2acc
Pull Request resolved: #2520
@kimishpatel kimishpatel changed the base branch from gh/kimishpatel/7/base to main August 12, 2025 02:36
@kimishpatel kimishpatel merged commit 6794ef5 into main Aug 13, 2025
35 of 36 checks passed
@kimishpatel kimishpatel deleted the gh/kimishpatel/7/head branch August 13, 2025 22:31
liangel-02 pushed a commit that referenced this pull request Aug 25, 2025
* When replacing literals with placeholders lists are always converted to
tuples

* Allow pattern replacement to ignore literals

* Reference representation of dqlinear int4 for xnnpack

* Update base for Update on "Reference representation of dqlinear int4 for xnnpack"


Labels

CLA Signed
topic: new feature


3 participants