fix LayerNorm f16 CPU implementation #22479

Merged
fs-eire merged 5 commits into main from fs-eire/fix-layernorm-cpu-f16 on Oct 18, 2024

Conversation

@fs-eire (Contributor) commented Oct 17, 2024

Description

The recent PR #22223 introduced two bugs in the CPU LayerNorm f16 implementation:

  • possible access to nullptr for bias
    `const TensorShape& bias_shape = bias->Shape();` will crash when `bias` does not exist. (Surprisingly, this case does not appear to be covered by any test.)
    • fix: guard with a pointer check (see the sketch after this list)
  • a race condition inside ComputeJob
    `ComputeJob()` is dispatched to the thread pool and internally modifies `LayerNormImpl::scale_fp32_` and `LayerNormImpl::bias_fp32_`, which are `std::unique_ptr`s and are not thread-safe.
    • fix: move the modification of `LayerNormImpl::scale_fp32_` and `LayerNormImpl::bias_fp32_` out of `ComputeJob()` and into `LayerNormImpl::ComputeWithoutContext()`. A race may still be possible because `ConcurrentRunSupported` is set to `true` for the CPU EP, so an `OrtMutex` was added.
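
A minimal, self-contained sketch of the pointer guard for the first bug, in plain C++. The `Tensor`/`TensorShape` stand-ins and the `BiasSize` helper are illustrative assumptions to make the snippet compile on its own, not the actual onnxruntime code:

```cpp
#include <cstddef>

// Stand-ins for onnxruntime's Tensor/TensorShape, only so the sketch
// is self-contained; the real types live in the ORT codebase.
struct TensorShape {
  std::size_t Size() const { return 8; }
};
struct Tensor {
  TensorShape shape;
  const TensorShape& Shape() const { return shape; }
};

// The bug: calling bias->Shape() unconditionally crashes when the
// optional bias input is absent. The fix is a null-pointer guard.
std::size_t BiasSize(const Tensor* bias) {
  if (bias == nullptr) return 0;  // bias is an optional input
  return bias->Shape().Size();
}
```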

This should fix the recent flaky tests as well.

Comment thread onnxruntime/core/providers/cpu/nn/layer_norm_impl.h Outdated
amarin16 previously approved these changes Oct 17, 2024
Comment thread onnxruntime/core/providers/cpu/nn/layer_norm_impl.h Outdated
Comment thread onnxruntime/core/providers/cpu/nn/layer_norm_impl.cc Outdated
@tianleiwu (Contributor) left a comment

Mutex is not needed. See other comments.

@fs-eire (Contributor, Author) commented Oct 17, 2024

Updated to resolve the comments:

  • mutex removed.
  • the `unique_ptr` members holding the float buffers are now assigned only in `Prepack()`
    • if prepack is used, always use the stored `prepacked_scale_fp32_data_` and `prepacked_bias_fp32_data_`.
    • if prepack is not used, the scale and bias are not initializers, so always convert them from f16 to f32 per call (see the sketch after this list).
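
A self-contained sketch of that scheme in plain C++. The class and method names and the `HalfToFloat` placeholder are illustrative assumptions, not the actual ORT code:

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Placeholder conversion; ORT converts real IEEE f16 (MLFloat16) values.
static float HalfToFloat(uint16_t bits) { return static_cast<float>(bits); }

class LayerNormSketch {
 public:
  // Runs once, single-threaded, only when scale is an initializer.
  void PrePack(const std::vector<uint16_t>& scale_f16) {
    auto buf = std::make_unique<std::vector<float>>();
    for (uint16_t h : scale_f16) buf->push_back(HalfToFloat(h));
    prepacked_scale_fp32_ = std::move(buf);
  }

  // May run concurrently. Reads the prepacked buffer when present;
  // otherwise converts into a function-local buffer. No member is
  // written here, so no mutex is needed.
  float SumScale(const std::vector<uint16_t>& scale_f16) const {
    std::vector<float> local;
    const std::vector<float>* scale = prepacked_scale_fp32_.get();
    if (scale == nullptr) {
      for (uint16_t h : scale_f16) local.push_back(HalfToFloat(h));
      scale = &local;
    }
    float sum = 0.0f;
    for (float v : *scale) sum += v;
    return sum;
  }

 private:
  std::unique_ptr<std::vector<float>> prepacked_scale_fp32_;
};
```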

fs-eire requested a review from tianleiwu on October 17, 2024 at 22:46
tianleiwu previously approved these changes Oct 17, 2024
Comment thread onnxruntime/test/onnx/microbenchmark/layer_normalization.cc Outdated
Comment thread onnxruntime/core/providers/cpu/nn/layer_norm_impl.cc Outdated
fs-eire merged commit b4cb937 into main on Oct 18, 2024
fs-eire deleted the fs-eire/fix-layernorm-cpu-f16 branch on October 18, 2024 at 01:49
guschmue pushed a commit that referenced this pull request Oct 18, 2024
tianleiwu pushed a commit that referenced this pull request Oct 18, 2024
apsonawane pushed a commit that referenced this pull request Oct 22, 2024
sophies927 added the cherry-picked label on Oct 22, 2024
@snnn (Contributor) commented Sep 5, 2025

This PR has been cherry-picked into the rel-1.20.0 branch in PR #22526. Removing the release:1.20.0 label.

Labels: cherry-picked

5 participants