Conversation

@skottmckay
Contributor

Description

Update the usability checker and related infrastructure to support checking models > 2GB.

  • Add the ability to set a flag that keeps initializers as external data
    • we optimize the model as part of the checking, so we need to write out a new copy.
  • Handle an issue with ONNX shape inferencing silently failing on large models
    • use the API that supports large models but requires writing the model to a new file
    • automate cleanup of that copy of the model

Motivation and Context

Allow analysis of LLMs to determine gaps for mobile usage.

- Add ability to set flag to keep initializers as external data
- Handle issue with ONNX shape inferencing silently failing
@skottmckay skottmckay requested a review from edgchen1 November 9, 2023 01:52
@thiagocrepaldi
Copy link
Contributor

@skottmckay do you think this PR would also fix #14697?

@skottmckay
Copy link
Contributor Author

> @skottmckay do you think this PR would also fix #14697?

It will get closer, in that it would reach the point of attempting to create the flatbuffer for the ORT format model. However, the flatbuffer offsets are unsigned 32-bit ints, so at most 4GB of data can be written out. That's still better than the 2GB protobuf limit, though.

It's not clear what the scenario is where you'd need to use an ORT format model here: that implies using a minimal build to save a few MB while loading a model that is multiple GB. Because of that, we haven't prioritized supporting these models in ORT format.

skottmckay and others added 3 commits November 15, 2023 16:35
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
@skottmckay skottmckay merged commit e7a524f into main Nov 16, 2023
@skottmckay skottmckay deleted the skottmckay/SupportLargeModelsInUsabilityChecker branch November 16, 2023 21:20
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024