Add warnings to AI image descriptions #19327
Merged
CyrilleB79
reviewed
Dec 4, 2025
Qchristensen
approved these changes
Dec 5, 2025
Qchristensen
left a comment
Member
Reads well, good change.
seanbudd
reviewed
Dec 5, 2025
Co-authored-by: Sean Budd <sean@nvaccess.org>
seanbudd
approved these changes
Dec 9, 2025
SaschaCowley
pushed a commit
that referenced
this pull request
Jan 11, 2026
Reverts:
- #18475
- #19036
- #19024
- #19055
- #19057
- #19178
- #19243
- #19327
- Partial revert: #19342

### Issues fixed
Fixes #19298

### Issues reopened
Reopens #16281

### Reason for revert / Can this PR be reimplemented? If so, what is required for the next attempt
The current implementation of AI image descriptions yields low quality captions from a 3 year old model (see #19298). The current implementation also requires using numpy, which hogs RAM, slows initialization, and increases the weight of the installer.

An attempt was made to convert this to C++ using WinML and Windows ONNX runtimes as per #18662. This would have removed numpy, and improved flexibility for using different models in the future. Unfortunately, this was not found to be feasible, as ONNX C++ fails to work via 64bit emulation on ARM (microsoft/onnxruntime#15403).

This means we have the following options for image descriptions:
1. Continue to use the python onnxruntime, and accept the RAM and storage hits. Instead, improve the quality of the captioner with better models such as [git-base-coco](https://huggingface.co/microsoft/git-base-coco) or [blip2](https://huggingface.co/Salesforce/blip2-opt-2.7b-coco).
2. Wait until MS builds ARM64EC into C++ ONNX (blocked by microsoft/onnxruntime#15403)
3. Attempt to build our own fork of ONNX with ARM64EC
4. Build a separate ARM native installer of NVDA, offer as an alternative to allow for ARM devices to do image descriptions with numpy.
5. Release the feature on C++ without support for ARM devices.

All of these options require a significant amount of work. As such, sadly this feature is not ready for a stable release. Instead this code will be moved to a feature branch, until ONNX C++ matures such as fixing microsoft/onnxruntime#15403.

Additionally, ONNX C++ runtimes are only available through the experimental 2.0 version of the Windows App SDK, and requires you to build your own headers from it.
I think this feature will be blocked until microsoft/onnxruntime#15403 is implemented and the 2.0 version of the Windows App SDK becomes stable. Future re-implementations should also consider using higher quality, more modern models.
Link to issue number:
Related to #19053
Related to #19298
Summary of the issue:
The on-device AI image descriptions introduced into NVDA tend to hallucinate, especially when used on material other than photographs.
Description of user facing changes:
Description of developer facing changes:
None
Description of development approach:
Used a template string when outputting the AI slop so that a hedge-phrase can be used.
Wrote up a warning message for the user guide. Shortened it slightly and inserted into the settings panel and temporary enable dialog.
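For readers without the diff to hand, the template-string approach can be sketched roughly as below. This is a minimal illustration, not the merged code: `HEDGED_CAPTION_TEMPLATE`, `format_caption`, the "Could be:" wording, and the empty-caption fallback are all hypothetical, and the real strings would presumably be wrapped for translation.

```python
# Hypothetical sketch of hedging a model caption with a template string.
# None of these names or phrases are taken from the NVDA source.

# The hedge phrase lives in the template, so translators could reword or
# reorder it around the {caption} placeholder.
HEDGED_CAPTION_TEMPLATE = "Could be: {caption}"

# Assumed fallback for when the model returns nothing usable.
NO_CAPTION_MESSAGE = "No description available"


def format_caption(caption: str) -> str:
    """Wrap a raw model caption in a hedge phrase before it is presented."""
    caption = caption.strip()
    if not caption:
        return NO_CAPTION_MESSAGE
    # The caption is passed as an argument, never used as the template,
    # so braces inside the model output are left untouched.
    return HEDGED_CAPTION_TEMPLATE.format(caption=caption)


if __name__ == "__main__":
    print(format_caption("a dog running on a beach"))
```

Keeping the hedge in a single template, rather than concatenating strings at each call site, means the wording only has to be changed in one place.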
I have not updated changes.md, as that would just create merge conflicts. I will do it in #19319.
Testing strategy:
Ran NVDA. Checked that the settings panel speaks the warning and that the warning is visible. Checked that the warning is shown in the temp enable dialog.
Known issues with pull request:
Doesn't address the underlying issue.
The language used for this feature also seems confusing and inconsistent, but that is out of scope for this issue.
Code Review Checklist: