support progress report for AI image descriptions download by tianzeshi-study · Pull Request #19036 · nvaccess/nvda

tianzeshi-study · 2025-10-04T14:01:23Z

Link to issue number:

Resolves #19019

Summary of the issue:

No progress is shown while the AI image descriptions model is downloading.

Description of user facing changes:

Users receive progress notifications while the AI image descriptions model is downloading.

Description of developer facing changes:

Reduce timeouts in modelDownloader so that users do not wait excessively when the network is poor during multi-threaded downloads.

Remove the timeout argument from future.result() in modelDownloader to avoid raising a timeout error when the download succeeds.
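To illustrate the point about `future.result()`, here is a minimal sketch, not the actual modelDownloader code. It assumes the download runs in a `concurrent.futures` thread pool; `slowDownload`, `waitWithTimeout`, and `waitWithoutTimeout` are hypothetical names, and the sleep stands in for real network traffic. With a timeout on `result()`, the caller can give up on a download that would still have completed successfully.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeoutError


def slowDownload(seconds: float) -> str:
    """Hypothetical stand-in for a file download that takes `seconds` to finish."""
    time.sleep(seconds)
    return "done"


def waitWithTimeout(seconds: float, timeout: float) -> str:
    """Wait on the future with a timeout; return 'timed out' if the wait expires first."""
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(slowDownload, seconds)
        try:
            return future.result(timeout=timeout)
        except FutureTimeoutError:
            # The worker thread keeps running and will finish successfully,
            # but the caller has already treated the download as failed.
            return "timed out"


def waitWithoutTimeout(seconds: float) -> str:
    """Wait on the future until it completes, however long that takes."""
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(slowDownload, seconds)
        return future.result()
```

In this sketch, waiting without a timeout relies on the work inside the thread (for example, per-request network timeouts) to bound how long the call can block.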

Use OOP to introduce progress reporting in _localCaptioner.messageDialogs.
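As a rough illustration of object-oriented progress reporting, the sketch below tracks received bytes in a small class and notifies a callback whenever the whole-number percentage changes. It is an assumption-laden example, not the code in `_localCaptioner.messageDialogs`: `DownloadProgress`, `consumeChunks`, and the callback signature are all hypothetical, and no GUI toolkit is involved.

```python
from typing import Callable, Iterable


class DownloadProgress:
    """Hypothetical helper that tracks bytes received and notifies a callback."""

    def __init__(self, totalBytes: int, onProgress: Callable[[int], None]) -> None:
        self.totalBytes = totalBytes
        self.receivedBytes = 0
        self._onProgress = onProgress
        self._lastPercent = -1

    def update(self, chunkSize: int) -> None:
        """Record a received chunk; report only when the integer percentage changes."""
        self.receivedBytes += chunkSize
        if self.totalBytes <= 0:
            return  # Unknown total size: nothing meaningful to report.
        percent = min(100, self.receivedBytes * 100 // self.totalBytes)
        if percent != self._lastPercent:
            self._lastPercent = percent
            self._onProgress(percent)


def consumeChunks(chunks: Iterable[bytes], progress: DownloadProgress) -> int:
    """Feed each downloaded chunk to the progress tracker and return the byte count."""
    for chunk in chunks:
        progress.update(len(chunk))
    return progress.receivedBytes
```

A dialog could pass a callback that updates a gauge or speaks the percentage. For example, with `reported = []` and `progress = DownloadProgress(4, reported.append)`, calling `consumeChunks([b"a", b"b", b"cd"], progress)` leaves `reported` equal to `[25, 50, 100]`.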

Description of development approach:

None

Testing strategy:

Make sure the AI image descriptions models are not installed, then enable AI image descriptions in the settings panel. Click the buttons in the image descriptions message dialog to test the model download manually.

Known issues with pull request:

None

Code Review Checklist:

Documentation:
- Change log entry
- User Documentation
- Developer / Technical Documentation
- Context sensitive help for GUI changes
Testing:
- Unit tests
- System (end to end) tests
- Manual testing
UX of all users considered:
- Speech
- Braille
- Low Vision
- Different web browsers
- Localization in other languages / culture than English
API is compatible with existing add-ons.
Security precautions taken.

seanbudd · 2025-10-06T04:56:38Z

Please fill out the PR description and make sure to explain the changes you made, e.g. why the timeouts have changed.

seanbudd

Thanks @tianzeshi-study

…19036)" This reverts commit 758d7c4.

Reverts:
- #18475
- #19036
- #19024
- #19055
- #19057
- #19178
- #19243
- #19327
- Partial revert: #19342

### Issues fixed

Fixes #19298

### Issues reopened

Reopens #16281

### Reason for revert / Can this PR be reimplemented? If so, what is required for the next attempt

The current implementation of AI image descriptions yields low quality captions from a 3-year-old model (see #19298). The current implementation also requires using numpy, which hogs RAM, slows initialization, and increases the weight of the installer.

An attempt was made to convert this to C++ using WinML and Windows ONNX runtimes as per #18662. This would have removed numpy and improved flexibility for using different models in the future. Unfortunately, this was not found to be feasible, as ONNX C++ fails to work via 64-bit emulation on ARM (microsoft/onnxruntime#15403).

This means we have the following options for image descriptions:

1. Continue to use the Python onnxruntime, and accept the RAM and storage hits. Instead, improve the quality of the captioner with better models such as [git-base-coco](https://huggingface.co/microsoft/git-base-coco) or [blip2](https://huggingface.co/Salesforce/blip2-opt-2.7b-coco).
2. Wait until MS builds ARM64EC into C++ ONNX (blocked by microsoft/onnxruntime#15403).
3. Attempt to build our own fork of ONNX with ARM64EC.
4. Build a separate ARM native installer of NVDA, offered as an alternative to allow ARM devices to do image descriptions with numpy.
5. Release the feature on C++ without support for ARM devices.

All of these options require a significant amount of work. As such, sadly this feature is not ready for a stable release. Instead this code will be moved to a feature branch until ONNX C++ matures, for example by fixing microsoft/onnxruntime#15403. Additionally, ONNX C++ runtimes are only available through the experimental 2.0 version of the Windows App SDK, and require you to build your own headers from it.

I think this feature will be blocked until microsoft/onnxruntime#15403 is implemented and the 2.0 version of the Windows App SDK becomes stable. Future re-implementations should also consider using higher quality, more modern models.

support progress report for AI image descriptions download

4c02f2f

tianzeshi-study requested a review from a team as a code owner October 4, 2025 14:01

tianzeshi-study requested a review from SaschaCowley October 4, 2025 14:01

Pre-commit auto-fix

8959128

seanbudd requested review from seanbudd and removed request for SaschaCowley October 6, 2025 03:58

seanbudd approved these changes Oct 6, 2025


Comment thread source/gui/_localCaptioner/messageDialogs.py

Update source/gui/_localCaptioner/messageDialogs.py

5c97484

seanbudd enabled auto-merge (squash) October 6, 2025 07:28

seanbudd merged commit 758d7c4 into nvaccess:master Oct 6, 2025
29 checks passed

github-actions Bot added this to the 2026.1 milestone Oct 6, 2025

seanbudd added a commit that referenced this pull request Jan 9, 2026

Revert "support progress report for AI image descriptions download (#…

ee51e77

…19036)" This reverts commit 758d7c4.

seanbudd mentioned this pull request Jan 9, 2026

Revert AI image description work #19425

Merged

seanbudd merged 3 commits into nvaccess:master from tianzeshi-study:imageDescDownloadProgressReport
