
Conversation

@thevishalagarwal
Contributor

  • Implements `GetEPContextNodes()`
  • Enables usage of `AddExternalInitializersFromFilesInMemory` for models that have to be communicated as a byte stream but are larger than 2 GB
  • Adds EP context unit tests for files, byte streams, and both embed modes

NOTE: For large models (>2 GB), `embed_mode=0` must be used; `embed_mode=1` fails due to protobuf limitations.
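For context, a rough sketch of the workflow these changes enable might look like the following. This is not code from the PR: the weight-file name, buffer contents, and output path are hypothetical placeholders, while `AddExternalInitializersFromFilesInMemory`, `AddConfigEntry`, and the `ep.context_*` session-option keys are existing ONNX Runtime C++ API surface.

```cpp
// Sketch (assumptions noted above): feed >2 GB external weights from memory
// and request an EP-context model with embed_mode=0 so engine blobs land in
// a separate file instead of hitting protobuf's 2 GB message limit.
#include <onnxruntime_cxx_api.h>

#include <string>
#include <vector>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "epctx");
  Ort::SessionOptions so;

  // EP-context generation options; embed_mode=0 stores the engine outside
  // the ONNX protobuf, which is required for models larger than 2 GB.
  so.AddConfigEntry("ep.context_enable", "1");
  so.AddConfigEntry("ep.context_embed_mode", "0");
  so.AddConfigEntry("ep.context_file_path", "model_ctx.onnx");  // hypothetical path

  // External weights already held in memory (e.g. received as a byte stream).
  // The name must match the external-data location referenced by the model.
  std::vector<char> weight_bytes(16, 0);  // placeholder for real weight data
  std::vector<std::basic_string<ORTCHAR_T>> names{ORT_TSTR("weights.bin")};
  std::vector<char*> buffers{weight_bytes.data()};
  std::vector<size_t> lengths{weight_bytes.size()};
  so.AddExternalInitializersFromFilesInMemory(names, buffers, lengths);

  // The model itself would also arrive as bytes rather than a file path:
  // std::vector<char> model_bytes = /* received over the wire */;
  // Ort::Session session(env, model_bytes.data(), model_bytes.size(), so);
  return 0;
}
```

The in-memory variant matters here because the file-path route is unavailable when the model never touches disk, yet a single serialized protobuf cannot carry the >2 GB weights inline.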

@gedoensmax
Contributor

@chilo-ms we will need a review of this :)
We disabled the option to refit for now (no weightless engines), since we want to make the interaction cleaner.

@gedoensmax
Contributor

@jywu-msft We are adding more unit tests that I believe will also help test the compile API etc. in ORT. Can we resurface the topic of running the NV EP in the official ORT CI?

@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

yuslepukhin added a commit that referenced this pull request Aug 18, 2025
### Description
See the title.

### Motivation and Context
Make traditional EPs (non-plug-in) access OrtValue initializers.

Re: #25747
jywu-msft added the `ep:NvRTX` (NV RTX execution provider) label Aug 19, 2025
@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

adrianlizarraga pushed a commit that referenced this pull request Aug 21, 2025
@gedoensmax
Contributor

@chilo-ms Since I resolved the cases where weights were unnecessarily copied, per Dmitri's comments, this should be ready to merge.

@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@chilo-ms chilo-ms merged commit 8d8a82b into microsoft:main Aug 21, 2025
98 of 100 checks passed
adrianlizarraga pushed a commit that referenced this pull request Aug 22, 2025
adrianlizarraga added a commit that referenced this pull request Aug 25, 2025
### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:
- #25592
- #25622
- #25688
- #25729
- #25743
- #25769
- #25745
- #25761
- #25751
- #25716
- #25228
- #25768
- #25788
- #25747
- #25800
- #25818
- #25762
- #25749
- #25831



---------

Co-authored-by: quic-tirupath <quic_tirupath@quicinc.com>
Co-authored-by: quic-calvnguy <quic_calvnguy@quicinc.com>
Co-authored-by: qti-kromero <kromero@qti.qualcomm.com>
Co-authored-by: Jeff Kilpatrick <jkilpatrick@qti.qualcomm.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: David Fan <30608893+jiafatom@users.noreply.github.com>
Co-authored-by: kuanyul-qti <kuanyul@qti.qualcomm.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Chunye Wang@AMD <chunywan@amd.com>
Co-authored-by: minfhong-qti <minfhong@qti.qualcomm.com>
Co-authored-by: Vishal Agarwal <vishala@nvidia.com>
Co-authored-by: Maximilian Müller <maximilianm@nvidia.com>
Co-authored-by: Maximilian Müller <44298237+gedoensmax@users.noreply.github.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: adrastogi <aditya.rastogi@microsoft.com>
Co-authored-by: Aditya Rastogi <adityar@ntdev.microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
gedoensmax pushed a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025
gedoensmax added a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025