Avoid running AI image descriptions while screen curtain is enabled #19057

Merged
seanbudd merged 2 commits into nvaccess:master from tianzeshi-study:fixImageDescInScreenCurtain on Oct 8, 2025

@tianzeshi-study

Contributor

Link to issue number:

Fixes #19045

Summary of the issue:

Image Description not disabled while Screen Curtain is enabled

Description of user facing changes:

Users will be notified that AI image description is unavailable when they try to use it while the screen curtain is enabled.

Description of developer facing changes:

A SCREEN_CURTAIN member is added to gui.blockAction.Context. Developers can pass it to the blockAction decorator to block an action while the screen curtain is enabled.

Description of development approach:

None

Testing strategy:

Use AI image descriptions while screen curtain is enabled.

Known issues with pull request:

None

Code Review Checklist:

  • Documentation:
    • Change log entry
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • API is compatible with existing add-ons.
  • Security precautions taken.

@tianzeshi-study tianzeshi-study requested a review from a team as a code owner October 7, 2025 09:10
Comment thread source/globalCommands.py Outdated
@CyrilleB79

Contributor

Also could you update the User Guide to mention this limitation where suitable, as suggested in #19045 (comment).

I'd also suggest using the same decorator for the OCR script, so that we get a more standard UX. It's up to NV Access to make this decision though.

Comment thread source/gui/blockAction.py
@tianzeshi-study

tianzeshi-study commented Oct 7, 2025

Contributor Author

> Also could you update the User Guide to mention this limitation where suitable, as suggested in #19045 (comment).
>
> I'd also suggest to use the same decorator for OCR script, so that we get a more standard UX. It's up to NV Access to make this decision though.

Nice suggestion. I’ll update the code and wait for further feedback from NV Access.

@tianzeshi-study tianzeshi-study marked this pull request as draft October 7, 2025 09:53
@tianzeshi-study tianzeshi-study marked this pull request as ready for review October 7, 2025 10:04

@seanbudd seanbudd left a comment

Member


@seanbudd seanbudd merged commit 61ffb2f into nvaccess:master Oct 8, 2025
29 checks passed
@github-actions github-actions Bot added this to the 2026.1 milestone Oct 8, 2025
seanbudd added a commit that referenced this pull request Jan 9, 2026
SaschaCowley pushed a commit that referenced this pull request Jan 11, 2026
Reverts:
- #18475
- #19036
- #19024
- #19055
- #19057
- #19178
- #19243
- #19327
- Partial revert: #19342

### Issues fixed
Fixes #19298 

### Issues reopened
Reopens #16281

### Reason for revert / Can this PR be reimplemented? If so, what is required for the next attempt

The current implementation of AI image descriptions yields low-quality
captions from a 3-year-old model (see #19298).
The current implementation also requires using numpy, which hogs RAM,
slows initialization, and increases the weight of the installer.
An attempt was made to convert this to C++ using WinML and Windows ONNX
runtimes as per #18662.
This would have removed numpy, and improved flexibility for using
different models in the future.
Unfortunately, this was not found to be feasible, as ONNX C++ fails to
work via 64-bit emulation on ARM
(microsoft/onnxruntime#15403).

This means we have the following options for image descriptions:

1. Continue to use the Python onnxruntime and accept the RAM and
storage hits, but improve the quality of the captioner with better
models such as
[git-base-coco](https://huggingface.co/microsoft/git-base-coco) or
[blip2](https://huggingface.co/Salesforce/blip2-opt-2.7b-coco).
2. Wait until MS builds ARM64EC into C++ ONNX (blocked by
microsoft/onnxruntime#15403)
3. Attempt to build our own fork of ONNX with ARM64EC
4. Build a separate ARM native installer of NVDA, offer as an
alternative to allow for ARM devices to do image descriptions with
numpy.
5. Release the feature on C++ without support for ARM devices.

All of these options require a significant amount of work.
As such, sadly this feature is not ready for a stable release.

Instead, this code will be moved to a feature branch until ONNX C++
matures, for example by fixing
microsoft/onnxruntime#15403.
Additionally, ONNX C++ runtimes are only available through the
experimental 2.0 version of the Windows App SDK, and require you to
build your own headers from it.
I think this feature will be blocked until
microsoft/onnxruntime#15403 is implemented and
the 2.0 version of the Windows App SDK becomes stable.
Future re-implementations should also consider using higher quality,
more modern models.