Update to allow large models to be checked for mobile support. #18357
- Add ability to set flag to keep initializers as external data
- Handle issue with ONNX shape inferencing silently failing
@skottmckay do you think this PR would also fix #14697?
It will get closer, in that it would reach the point of attempting to create the flatbuffer for the ORT format model. However, the flatbuffer offsets are unsigned 32-bit ints, so at most 4GB of data could be written out. That is better than the 2GB protobuf limit, though. It's not clear what scenario would need an ORT format model here: that implies using a minimal build, which saves a few MB, to load a model that is multiple GB. Because of that, we haven't prioritized supporting these models in ORT format.
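To make the size reasoning in the reply above concrete, here is a minimal sketch of the arithmetic. The constant and function names are hypothetical, and the values are only the upper bounds implied by a signed 32-bit size (protobuf) and an unsigned 32-bit offset (as described for the flatbuffer); the practical limits of either library may be lower.

```python
# Upper bounds implied by the integer widths mentioned in the comment.
# These names are illustrative and do not come from ONNX Runtime.
PROTOBUF_LIMIT_BYTES = 2**31 - 1       # signed 32-bit size: just under 2 GiB
UINT32_OFFSET_LIMIT_BYTES = 2**32 - 1  # unsigned 32-bit offset: just under 4 GiB


def fits(size_bytes):
    """Report which of the two bounds a payload of size_bytes stays within."""
    return {
        "protobuf": size_bytes <= PROTOBUF_LIMIT_BYTES,
        "uint32_offsets": size_bytes <= UINT32_OFFSET_LIMIT_BYTES,
    }


GIB = 1024**3
print(fits(3 * GIB))  # exceeds the protobuf bound, within the 32-bit offset bound
print(fits(5 * GIB))  # exceeds both bounds
```

A model between roughly 2GB and 4GB would therefore clear the protobuf bound only by using external data, while anything larger would exceed the offset range regardless.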
…geModelsInUsabilityChecker
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
…soft#18357)

### Description

Update usability checker and related infrastructure to support checking models > 2GB.

- Add ability to set flag to keep initializers as external data
  - we optimize the model as part of the checking so need to write out a new copy.
- Handle issue with ONNX shape inferencing silently failing
  - use API that supports large models but requires writing the model to a new file
  - automate cleanup of that copy of the model

### Motivation and Context

Allow analysis of LLMs to determine gaps for mobile usage.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
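The shape inferencing change described above (use an API that supports large models, write the result to a new file, and clean that copy up automatically) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `temporary_sibling` and `load_with_shape_info` are hypothetical names, and the sketch assumes the `onnx` package is installed. It relies on `onnx.shape_inference.infer_shapes_path`, which works on file paths and so avoids the 2GB in-memory protobuf limit. The temporary file is written next to the original model so that relative external data references still resolve.

```python
import contextlib
import pathlib


@contextlib.contextmanager
def temporary_sibling(model_path, suffix=".tmp_shape_inferred.onnx"):
    """Yield a path next to model_path and delete that file on exit.

    Hypothetical helper; the suffix is an arbitrary choice for this sketch.
    """
    tmp_path = pathlib.Path(model_path).with_suffix(suffix)
    try:
        yield tmp_path
    finally:
        # Clean up even if shape inferencing or loading raised.
        tmp_path.unlink(missing_ok=True)


def load_with_shape_info(model_path):
    """Run path-based shape inferencing and return the resulting ModelProto."""
    import onnx  # imported lazily so the helper above has no onnx dependency

    with temporary_sibling(model_path) as tmp_path:
        # Path-based API: reads the model from disk and writes the model
        # with inferred shapes to tmp_path.
        onnx.shape_inference.infer_shapes_path(str(model_path), str(tmp_path))
        # Load before leaving the block; the temporary file is removed afterwards.
        return onnx.load(str(tmp_path))
```

Tying the cleanup to a context manager means the extra copy of a multi-GB model does not linger on disk if an exception interrupts the check.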