@fanchenkong1
Contributor

Description

Use wasm_f32x4_relaxed_max and wasm_f32x4_relaxed_min in the WASM relaxed SIMD build.

Motivation and Context

This PR replaces wasm_f32x4_min/max with their relaxed SIMD counterparts, wasm_f32x4_relaxed_min/max, in the WASM relaxed SIMD build.

According to the relaxed SIMD proposal (https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#relaxed-min-and-max), wasm_f32x4_relaxed_min/max allow implementation-defined behavior for NaN propagation and for -0.0 vs. +0.0. This lets WASM runtimes use minps/maxps on x64 platforms, which improves performance.

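For example, for wasm_f32x4_max -> wasm_f32x4_relaxed_max:
wasm_f32x4_max: see the implementation in V8 (https://source.chromium.org/chromium/chromium/src/+/main:v8/src/codegen/shared-ia32-x64/macro-assembler-shared-ia32-x64.cc;l=231)
wasm_f32x4_relaxed_max: maxps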

This change affects kernel functions that rely on MlasMaximumFloat32x4 and MlasMinimumFloat32x4, including various activations and the reduced min/max kernels. In the MLAS micro benchmark "COMPUTESOFTMAXINPLACE...", it improves performance by up to 60% on x64 devices.
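
To make the semantic difference concrete, here is a minimal scalar sketch of one lane of each operation. It is not the MLAS or V8 code: StrictMaxLane and MaxpsLane are hypothetical names, and MaxpsLane only models the case where a runtime lowers wasm_f32x4_relaxed_max to maxps (a runtime is also allowed to keep the strict behavior).

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar model of one lane of wasm_f32x4_max (strict semantics):
// any NaN input yields NaN, and +0.0 is treated as greater than -0.0.
inline float StrictMaxLane(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) {
        return std::numeric_limits<float>::quiet_NaN();
    }
    if (a == 0.0f && b == 0.0f) {
        // Both are zeros (possibly of different sign): prefer +0.0.
        return std::signbit(a) ? b : a;
    }
    return a > b ? a : b;
}

// Hypothetical scalar model of one lane of x64 maxps: the second operand is
// returned whenever the comparison a > b is false, which covers NaN inputs
// and zeros of opposite sign.
inline float MaxpsLane(float a, float b) {
    return a > b ? a : b;
}
```

The two models agree on ordinary inputs and differ only when an operand is NaN or when both operands are zeros of opposite sign, so kernels whose results do not depend on those cases are the ones that can safely take the faster path.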

@fanchenkong1 fanchenkong1 requested a review from a team as a code owner April 7, 2025 05:42
@guschmue
Contributor

guschmue commented Apr 7, 2025

/azp Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Command 'Linux' is not supported by Azure Pipelines.

Supported commands
  • help:
    • Get descriptions, examples and documentation about supported commands
    • Example: help "command_name"
  • list:
    • List all pipelines for this repository using a comment.
    • Example: "list"
  • run:
    • Run all pipelines or specific pipelines for this repository using a comment. Use this command by itself to trigger all related pipelines, or specify specific pipelines to run.
    • Example: "run" or "run pipeline_name, pipeline_name, pipeline_name"
  • where:
    • Report back the Azure DevOps orgs that are related to this repository and org
    • Example: "where"

See additional documentation.

@guschmue
Contributor

guschmue commented Apr 7, 2025

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@fanchenkong1
Contributor Author

@guschmue, thank you for reviewing this change!

This change requires a code review from microsoft/onnxruntime-mlas. @jywu-msft, could you please suggest a reviewer or help move it forward? Thank you!

@guschmue guschmue merged commit 04e0b50 into microsoft:main Apr 8, 2025
69 checks passed
zhaoxul-qti pushed a commit to CodeLinaro/onnxruntime that referenced this pull request Apr 17, 2025