
Upgrade submodule oneDNN to v3.3.6 for release/2.3 (#122164) #122930

Merged
atalman merged 1 commit into pytorch:release/2.3 from Xia-Weiwen:release_2.3_onednn_3.3.6 on Apr 2, 2024

Conversation

Collaborator

@Xia-Weiwen Xia-Weiwen commented Mar 29, 2024

Cherry-picked 481c9bb from main.


cc @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen

As per the title. Includes issue fixes for aarch64:
- uxlfoundation/oneDNN#1831
- uxlfoundation/oneDNN#1834

---

## Validation results
(on Intel CPU + Linux)
**Static quantization with Inductor on CV models**

Quant method | Geomean throughput ratio (v3.3.6/baseline)
-- | --
ptq | 0.982937
ptq (cpp wrapper) | 0.978384
qat | 0.978828

**Torchbench cpu userbenchmark with Inductor**

Items | Perf Geomean Ratio (v3.3.6/baseline)
-- | --
eager_throughtput_bf16_infer | 1.00x
eager_throughtput_fp32_infer | 1.00x
jit_llga_throughtput_amp_bf16 | 1.01x
jit_llga_throughtput_fp32 | 1.00x
eager_throughtput_fx_int8 | 1.00x
eager_throughtput_bf16_train | 1.46x
eager_throughtput_fp32_train | 1.41x

**Dynamo benchmarks tests**
Precision | Shape | Wrapper | Thread | Eager old/new GEOMEAN | Inductor old/new GEOMEAN
-- | -- | -- | -- | -- | --
Float32 | Static | Default | Multiple | 1.003836812 | 1.003425
Float32 | Static | Default | Single | 1.000181451 | 0.999611
Float32 | Dynamic | Default | Multiple | 1.003980183 | 1.006563
Float32 | Dynamic | Default | Single | 1.000076939 | 0.999969
AMP | Static | Default | Multiple | 0.996824772 | 0.998715
AMP | Static | Default | Single | 0.996402574 | 1.001483
AMP | Dynamic | Default | Multiple | 0.994919866 | 1.000467
AMP | Dynamic | Default | Single | 0.9962054 | 1.000767
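For readers reproducing the tables above: each ratio is a geometric mean of per-model throughput ratios (v3.3.6 over baseline). A minimal sketch of that computation, using hypothetical throughput values not taken from this PR, looks like this:

```python
import math

def geomean_ratio(new_tputs, base_tputs):
    """Geometric mean of per-model throughput ratios (new / baseline)."""
    ratios = [n / b for n, b in zip(new_tputs, base_tputs)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical per-model throughputs (samples/s), for illustration only:
baseline = [120.0, 85.0, 240.0]
v3_3_6 = [118.0, 84.0, 236.0]
print(f"{geomean_ratio(v3_3_6, baseline):.2f}x")  # -> 0.98x
```

A value near 1.00x means the upgrade is performance-neutral; values well above 1.00x (e.g. the 1.46x training row) indicate a genuine speedup rather than noise.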

(on Aarch64)
pytorch#122164 (comment)

---

Pull Request resolved: pytorch#122164
Approved by: https://github.com/snadampal, https://github.com/malfet, https://github.com/atalman
@pytorch-bot

pytorch-bot bot commented Mar 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/122930

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 4b3f607 with merge base 86a2d67:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the `module: mkldnn` and `topic: not user facing` labels Mar 29, 2024
Collaborator

@jgong5 jgong5 left a comment


Are the speedups with training from oneDNN upgrade alone or somewhere else?

@Xia-Weiwen
Collaborator Author

Are the speedups with training from oneDNN upgrade alone or somewhere else?

It's from a bug fix in ideep intel/ideep#291

@Xia-Weiwen Xia-Weiwen added the `intel` label Mar 29, 2024
@Xia-Weiwen Xia-Weiwen marked this pull request as ready for review March 29, 2024 03:48
@jgong5
Collaborator

jgong5 commented Mar 29, 2024

Are the speedups with training from oneDNN upgrade alone or somewhere else?

It's from a bug fix in ideep intel/ideep#291

Suggest to add this note to the upgrade here too.

@Xia-Weiwen
Collaborator Author

Are the speedups with training from oneDNN upgrade alone or somewhere else?

It's from a bug fix in ideep intel/ideep#291

Suggest to add this note to the upgrade here too.

Sure. Added.

@atalman atalman merged commit e22b534 into pytorch:release/2.3 Apr 2, 2024
@Xia-Weiwen Xia-Weiwen deleted the release_2.3_onednn_3.3.6 branch November 13, 2024 06:07

Labels

intel, module: mkldnn, open source, topic: not user facing

5 participants