OPENNLP-1707: Update ONNX Runtime to 1.21.0 #752

mawiesne merged 1 commit into main from dependabot/maven/onnxruntime.version-1.21.0 on Mar 10, 2025

@dependabot (bot) commented on behalf of github on Mar 10, 2025

Bumps onnxruntime.version from 1.20.0 to 1.21.0.
Updates com.microsoft.onnxruntime:onnxruntime from 1.20.0 to 1.21.0

Release notes

Sourced from com.microsoft.onnxruntime:onnxruntime's releases.

ONNX Runtime v1.21

Announcements

  • No large announcements of note this release! We've made a lot of small refinements to streamline your ONNX Runtime experience.

GenAI & Advanced Model Features

Enhanced Decoding & Pipeline Support

  • Added "chat mode" support for CPU, GPU, and WebGPU.
  • Provided support for decoder model pipelines.
  • Added support for Java API for MultiLoRA.

API & Compatibility Updates

Bug Fixes for Model Output

  • Fixed Phi series garbage output issues with long prompts.
  • Resolved gibberish issues with top_k on CPU.

Execution & Core Optimizations

Core Refinements

  • Reduced default logger usage for improved efficiency (#23030).
  • Fixed a visibility issue in threadpool (#23098).

Execution Provider (EP) Updates

General

  • Removed TVM EP from the source tree (#22827).
  • Marked NNAPI EP for deprecation (following Google's deprecation of NNAPI).
  • Fixed a DLL delay loading issue that impacts the usability of WebGPU EP and DirectML EP on Windows (#23111, #23227).

TensorRT EP Improvements

  • Added support for TensorRT 10.8.
  • Assigned DDS ops (NMS, RoiAlign, NonZero) to TensorRT by default.
  • Introduced option trt_op_types_to_exclude to exclude specific ops from TensorRT assignment.
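
As a rough illustration of the trt_op_types_to_exclude option mentioned above, the following Java sketch builds a provider-option entry that would keep the DDS ops off TensorRT. It uses only the standard library. The class and method names are hypothetical, and the comma-separated value format is an assumption that should be checked against the TensorRT EP documentation; NonMaxSuppression is used here as the ONNX operator name for NMS.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of ONNX Runtime or OpenNLP. */
final class TrtExcludeSketch {

    /**
     * Builds a key/value map of TensorRT provider options.
     * Assumption: the option value is a comma-separated list of op types.
     */
    static Map<String, String> providerOptions(List<String> opTypesToExclude) {
        Map<String, String> options = new LinkedHashMap<>();
        if (!opTypesToExclude.isEmpty()) {
            options.put("trt_op_types_to_exclude", String.join(",", opTypesToExclude));
        }
        return options;
    }

    public static void main(String[] args) {
        // Exclude the DDS ops that 1.21 assigns to TensorRT by default.
        System.out.println(providerOptions(List.of("NonMaxSuppression", "NonZero", "RoiAlign")));
    }
}
```

The resulting entries would then be handed to whichever TensorRT provider-options mechanism the chosen language binding offers; that step is left out here because it depends on the binding.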

QNN EP Improvements

  • Introduced QNN shared memory support.
  • Improved performance for AI Hub models.
  • Added support for QAIRT/QNN SDK 2.31.
  • Added Python 3.13 package.
  • Miscellaneous bug fixes and enhancements.
  • QNN EP is now built as a shared library/DLL by default. To retain previous build behavior, use build option --use_qnn static_lib.

DirectML EP Support & Upgrades

  • Updated DirectML version from 1.15.2 to 1.15.4 (#22635).

OpenVINO EP Improvements

  • Introduced OpenVINO EP Weights Sharing feature.
  • Added support for various contrib Ops in OVEP:
    • SkipLayerNormalization, MatMulNBits, FusedGemm, FusedConv, EmbedLayerNormalization, BiasGelu, Attention, DynamicQuantizeMatMul, FusedMatMul, QuickGelu, SkipSimplifiedLayerNormalization

... (truncated)

Commits
  • e0b66ca Round 2 of cherry-picks into rel-1.21.0 (#23899)
  • beb1a92 Cherry-picks into rel-1.21.0 (#23846)
  • 98511b0 Set build user's uid when creating Migraphx/ROCM docker images (#23657)
  • 23f787e [TensorRT EP] Add new provider option to exclude ops from running on TRT (#23...
  • 1b0a2ba Update cmake_cuda_architecture to control package size (#23671)
  • 8eb5513 [webgpu] Implement SubGroupMatrix based MatMulNBits for Metal (#23729)
  • d82604e [Optimizer] Fix exception for Q -> DQ sequence with different scale types (#2...
  • 754ee21 OVEP: Bug Fixes, Refactoring, and Contrib Ops Update (#23742)
  • 6715d4c Shape inference: GatherBlockQuantized dispatcher (#23748)
  • 75cf166 [QNN EP] Passthrough EP Parameters in Node (#23468)
  • Additional commits viewable in compare view

Updates com.microsoft.onnxruntime:onnxruntime_gpu from 1.20.0 to 1.21.0

Bumps `onnxruntime.version` from 1.20.0 to 1.21.0.

Updates `com.microsoft.onnxruntime:onnxruntime` from 1.20.0 to 1.21.0
- [Release notes](https://github.com/microsoft/onnxruntime/releases)
- [Changelog](https://github.com/microsoft/onnxruntime/blob/main/docs/ReleaseManagement.md)
- [Commits](microsoft/onnxruntime@v1.20.0...v1.21.0)

Updates `com.microsoft.onnxruntime:onnxruntime_gpu` from 1.20.0 to 1.21.0
- [Release notes](https://github.com/microsoft/onnxruntime/releases)
- [Changelog](https://github.com/microsoft/onnxruntime/blob/main/docs/ReleaseManagement.md)
- [Commits](microsoft/onnxruntime@v1.20.0...v1.21.0)

---
updated-dependencies:
- dependency-name: com.microsoft.onnxruntime:onnxruntime
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: com.microsoft.onnxruntime:onnxruntime_gpu
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
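
The `update-type: version-update:semver-minor` entries in the metadata above follow from comparing the two version strings: only the middle component changes between 1.20.0 and 1.21.0. A minimal Java sketch of that classification is shown below. The class and method names are hypothetical, it assumes plain MAJOR.MINOR.PATCH versions without qualifiers, and it is not Dependabot's actual implementation.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Dependabot or OpenNLP. */
final class SemverUpdateType {

    /** Parses a plain "MAJOR.MINOR.PATCH" string into three integers. */
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH: " + version);
        }
        return Arrays.stream(parts).mapToInt(Integer::parseInt).toArray();
    }

    /** Returns the update type based on the first component that differs. */
    static String classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return "version-update:semver-major";
        if (a[1] != b[1]) return "version-update:semver-minor";
        if (a[2] != b[2]) return "version-update:semver-patch";
        return "none"; // identical versions; label chosen for this sketch only
    }

    public static void main(String[] args) {
        // prints "version-update:semver-minor"
        System.out.println(classify("1.20.0", "1.21.0"));
    }
}
```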
@dependabot added the dependencies (Pull requests that update a dependency file) and java (Pull requests that update Java code) labels on Mar 10, 2025
@rzo1 requested a review from mawiesne on March 10, 2025 07:05
@mawiesne changed the title from "Bump onnxruntime.version from 1.20.0 to 1.21.0" to "OPENNLP-1707: Update ONNX Runtime to 1.21.0" on Mar 10, 2025
@mawiesne merged commit 1016b17 into main on Mar 10, 2025
10 checks passed
@mawiesne deleted the dependabot/maven/onnxruntime.version-1.21.0 branch on March 10, 2025 08:18
@mawiesne self-assigned this on Mar 10, 2025
