
Conversation

@thevishalagarwal
Contributor

  • Implements `GetEPContextNodes()`
  • Enables usage of `AddExternalInitializersFromFilesInMemory` for models that have to be communicated as a byte stream but are larger than 2 GB
  • Adds EP context unit tests for files, byte streams, and both embed modes

NOTE: For large models (>2 GB), `embed_mode=0` must be used; `embed_mode=1` fails due to protobuf limitations.
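For context, a rough sketch of the workflow these changes enable might look like the following. This is not code from the PR: the weight-file name, buffer contents, and output path are hypothetical placeholders, while `AddExternalInitializersFromFilesInMemory`, `AddConfigEntry`, and the `ep.context_*` session-option keys are existing ONNX Runtime C++ API surface.

```cpp
// Sketch (assumptions noted above): feed >2 GB external weights from memory
// and request an EP-context model with embed_mode=0 so engine blobs land in
// a separate file instead of hitting protobuf's 2 GB message limit.
#include <onnxruntime_cxx_api.h>

#include <string>
#include <vector>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "epctx");
  Ort::SessionOptions so;

  // EP-context generation options; embed_mode=0 stores the engine outside
  // the ONNX protobuf, which is required for models larger than 2 GB.
  so.AddConfigEntry("ep.context_enable", "1");
  so.AddConfigEntry("ep.context_embed_mode", "0");
  so.AddConfigEntry("ep.context_file_path", "model_ctx.onnx");  // hypothetical path

  // External weights already held in memory (e.g. received as a byte stream).
  // The name must match the external-data location referenced by the model.
  std::vector<char> weight_bytes(16, 0);  // placeholder for real weight data
  std::vector<std::basic_string<ORTCHAR_T>> names{ORT_TSTR("weights.bin")};
  std::vector<char*> buffers{weight_bytes.data()};
  std::vector<size_t> lengths{weight_bytes.size()};
  so.AddExternalInitializersFromFilesInMemory(names, buffers, lengths);

  // The model itself would also arrive as bytes rather than a file path:
  // std::vector<char> model_bytes = /* received over the wire */;
  // Ort::Session session(env, model_bytes.data(), model_bytes.size(), so);
  return 0;
}
```

The in-memory variant matters here because the file-path route is unavailable when the model never touches disk, yet a single serialized protobuf cannot carry the >2 GB weights inline.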

@gedoensmax
Contributor

@chilo-ms we will need a review of this :)
We disabled the option to refit for now (no weightless engines), since we want to make the interaction cleaner.

@gedoensmax
Contributor

@jywu-msft We are adding more unit tests that I believe will also help test the compile API etc. in ORT. Can we resurface the topic of running the NV EP in the official ORT CI?

@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

yuslepukhin added a commit that referenced this pull request Aug 18, 2025
### Description
See the title.

### Motivation and Context
Make traditional EPs (non-plug-in) access OrtValue initializers.

Re: #25747
jywu-msft added the `ep:NvRTX` (NV RTX execution provider) label Aug 19, 2025
@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

adrianlizarraga pushed a commit that referenced this pull request Aug 21, 2025
@gedoensmax
Contributor

@chilo-ms Since I resolved the cases where weights were unnecessarily copied, per Dmitri's comments, this should be ready to merge.

@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@chilo-ms chilo-ms merged commit 8d8a82b into microsoft:main Aug 21, 2025
98 of 100 checks passed
adrianlizarraga pushed a commit that referenced this pull request Aug 22, 2025
adrianlizarraga added a commit that referenced this pull request Aug 25, 2025
### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:
- #25592
- #25622
- #25688
- #25729
- #25743
- #25769
- #25745
- #25761
- #25751
- #25716
- #25228
- #25768
- #25788
- #25747
- #25800
- #25818
- #25762
- #25749
- #25831



---------

Co-authored-by: quic-tirupath <quic_tirupath@quicinc.com>
Co-authored-by: quic-calvnguy <quic_calvnguy@quicinc.com>
Co-authored-by: qti-kromero <kromero@qti.qualcomm.com>
Co-authored-by: Jeff Kilpatrick <jkilpatrick@qti.qualcomm.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: David Fan <30608893+jiafatom@users.noreply.github.com>
Co-authored-by: kuanyul-qti <kuanyul@qti.qualcomm.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Chunye Wang@AMD <chunywan@amd.com>
Co-authored-by: minfhong-qti <minfhong@qti.qualcomm.com>
Co-authored-by: Vishal Agarwal <vishala@nvidia.com>
Co-authored-by: Maximilian Müller <maximilianm@nvidia.com>
Co-authored-by: Maximilian Müller <44298237+gedoensmax@users.noreply.github.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: adrastogi <aditya.rastogi@microsoft.com>
Co-authored-by: Aditya Rastogi <adityar@ntdev.microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
gedoensmax pushed a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025
gedoensmax added a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025