
Only build 64bit NVDA #18924

Merged
seanbudd merged 12 commits into master from try-64bit
Oct 3, 2025

Conversation

@seanbudd

@seanbudd seanbudd commented Sep 15, 2025

Member

Link to issue number:

Part of #16304

Summary of the issue:

We are migrating to 64bit NVDA in 2026.1

Description of user facing changes:

Switch alpha builds to 64bit

Description of developer facing changes:

Description of development approach:

Testing strategy:

Known issues with pull request:

Blocked by:

  • A stable installer

Code Review Checklist:

  • Documentation:
    • Change log entry
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • API is compatible with existing add-ons.
  • Security precautions taken.

@github-actions github-actions Bot requested a deployment to snapshot September 15, 2025 01:51 Abandoned
Resolves #16281
Summary of the issue:

NVDA currently lacks a built‑in, offline image captioning feature. Existing solutions require a reliable internet connection—raising privacy concerns, potential costs, and latency—and many NVDA users (especially in developing regions or on older hardware) have limited connectivity or constrained resources. There is no robust, integrated offline alternative.
Description of user facing changes:

    Introduces device‑side image description directly within NVDA, requiring no cloud service.

    Adds three global commands (with default shortcuts):
        `NVDA+Windows+,`: Generate a caption for the current image under focus.
        `NVDA+Windows+Shift+,`: Release the loaded model and free memory.
        `NVDA+Windows+Ctrl+,`: Open the Model Manager GUI to download or manage models.

    Extends NVDA’s settings panel to enable/disable offline captioning and configure model paths.

Description of developer facing changes:

    New _localCaptioner module containing:
        captioner.py: Core inference engine exposing generate_caption(image) for producing text descriptions.
        panel.py: NVDA settings integration (lazy or on‑startup model loading, custom path).
        modelDownloader.py: CLI tool to download ONNX models.
        modelManager.py: GUI for selecting download paths and managing available models.

    Uses the Hugging Face Xenova/vit-gpt2-image-captioning model in ONNX format (via onnxruntime) to balance accuracy, speed, and low resource usage.

    Modular design allows for future extension to additional models or formats.

Description of development approach:

    Modular integration: Keeps _localCaptioner self‑contained and compatible with NVDA’s plugin architecture.
    Lightweight inference: Leverages ONNXRuntime for fast, local inference without heavy PyTorch or TensorFlow dependencies.
    Lazy loading: Model is only loaded when first invoked (or at startup, if configured), minimizing initial memory footprint.
    Dual interfaces: Provides both CLI scripts (captioner.py, modelDownloader.py) for quick tests and a GUI (modelManager.py) for end‑users.
    Extensible architecture: Configuration files (e.g., config.json) conform to Hugging Face format for easy swapping of models.
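
The lazy-loading approach described above can be sketched roughly as follows. This is not the code from `_localCaptioner`; the class name `LazyCaptioner`, the `loader` callable and the `load_on_startup` flag are hypothetical stand-ins used only to show the pattern of deferring an expensive model load until the first caption request, and of releasing the model again to free memory.

```python
from typing import Callable, Optional

# Hypothetical type: a loaded "model" is just a callable from image bytes to text.
Model = Callable[[bytes], str]


class LazyCaptioner:
    """Illustrative wrapper that defers an expensive model load until first use."""

    def __init__(self, loader: Callable[[], Model], load_on_startup: bool = False) -> None:
        self._loader = loader
        self._model: Optional[Model] = None
        if load_on_startup:
            # Optional eager path, mirroring the "at startup, if configured" case.
            self._ensure_loaded()

    @property
    def is_loaded(self) -> bool:
        return self._model is not None

    def _ensure_loaded(self) -> Model:
        # Load at most once; later calls reuse the cached model.
        if self._model is None:
            self._model = self._loader()
        return self._model

    def generate_caption(self, image: bytes) -> str:
        return self._ensure_loaded()(image)

    def release(self) -> None:
        # Drop the reference so the model's memory can be reclaimed.
        self._model = None
```

With this shape, the memory cost is only paid once a caption is actually requested, and a "release model" command maps onto `release()`.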
…18934)

Summary of the issue:

Change the button shown after a successful download from 'Yes' to 'OK'.
Description of user facing changes:

Users will see an "OK" button, rather than a "Yes" button, confirming that AI image descriptions downloaded successfully.
@github-actions github-actions Bot requested a deployment to snapshot September 17, 2025 00:04 Abandoned
@SaschaCowley SaschaCowley added the conceptApproved Similar 'triaged' for issues, PR accepted in theory, implementation needs review. label Sep 18, 2025
…8945)

Summary of the issue:
Fixed an issue where image descriptions would download successfully but would not be loaded automatically afterwards.

Description of user facing changes:
Image descriptions will be automatically loaded after successful download
@github-actions github-actions Bot requested a deployment to snapshot September 19, 2025 04:17 Abandoned
@CyrilleB79

Contributor

Hi,

Trying to create a portable from the latest snapshot of this branch, the process fails with the following:

  • A dialog indicating the error (which, by the way, is not very user-friendly)
  • The following error in the log:
ERROR - gui.installerGui.doCreatePortable (09:38:30.324) - MainThread (4036):
Failed to create portable copy
Traceback (most recent call last):
  File "gui\installerGui.pyc", line 638, in doCreatePortable
  File "systemUtils.pyc", line 237, in __init__
ctypes.ArgumentError: argument 2: TypeError: expected LP_c_void_p instance instead of pointer to c_long

Running the temp copy from the installer is OK.
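
The `ctypes.ArgumentError` above comes from ctypes argument checking rather than from Windows itself. The NVDA call site in `systemUtils` is not shown in this thread, so the following is only a minimal sketch under the assumption that the failing function declared a `POINTER(c_void_p)` out-parameter and was handed `byref()` of a `c_long`. The helper name `check_out_param` is hypothetical.

```python
import ctypes

# Assumed parameter type: an out-parameter receiving a pointer-sized handle.
HandleOut = ctypes.POINTER(ctypes.c_void_p)


def check_out_param(value) -> str:
    """Report whether ctypes would accept byref(value) for a HandleOut parameter."""
    try:
        # from_param is the same conversion hook ctypes runs for declared argtypes.
        HandleOut.from_param(ctypes.byref(value))
    except TypeError as err:
        return f"rejected: {err}"
    return "accepted"


if __name__ == "__main__":
    print(check_out_param(ctypes.c_long()))    # wrong pointee type
    print(check_out_param(ctypes.c_void_p()))  # pointer-sized handle storage
```

Declaring out-parameters with a pointer-sized type matters more on 64-bit builds, where a handle no longer fits in a 32-bit integer.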

@CyrilleB79

Contributor

I just realized that this is fixed in #18927, which has been merged into master.
Could the latest master branch be merged into the try-64bit branch so that this is fixed here too?

@seanbudd

Member Author

@CyrilleB79 - done

@github-actions github-actions Bot requested a deployment to snapshot September 22, 2025 00:01 Abandoned
@github-actions github-actions Bot requested a deployment to snapshot September 22, 2025 02:54 Abandoned
@CyrilleB79

Contributor

Thanks @seanbudd for the new build.

Please find below issues found while testing nvda_snapshot_try-64bit-52706,70587e82:

The "Reports the text on the Windows clipboard" command (NVDA+c)

When I copy text to the clipboard and press NVDA+c, I get the message "There is no text on the clipboard".

Copy from virtual buffer does not work

In Chrome, I select something in the virtual buffer and press control+c. The text is not copied and the following error is logged:

IO - inputCore.InputManager.executeGesture (09:10:25.761) - winInputHook (21604):
Input: kb(desktop):control+c
ERROR - scriptHandler.executeScript (09:10:25.762) - MainThread (5840):
error executing script: <bound method CursorManager.script_copyToClipboard of <NVDAObjects.IAccessible.chromium.ChromeVBuf object at 0x0000014793C95810>> with gesture 'contrôle+c'
Traceback (most recent call last):
  File "scriptHandler.pyc", line 300, in executeScript
  File "cursorManager.pyc", line 543, in script_copyToClipboard
  File "textInfos\__init__.pyc", line 612, in copyToClipboard
  File "api.pyc", line 411, in copyToClip
  File "winUser.pyc", line 890, in setClipboardData
  File "contextlib.pyc", line 148, in __exit__
  File "winKernel.pyc", line 549, in lock
ctypes.ArgumentError: argument 1: TypeError: 'HGLOBAL' object cannot be interpreted as an integer

@seanbudd

Member Author

Hi @CyrilleB79 - please report these as proper issues. This PR is mainly for testing the image description work and any other code that can only go into 64bit NVDA.

@github-actions github-actions Bot requested a deployment to snapshot September 24, 2025 05:02 Abandoned
@github-actions github-actions Bot requested a deployment to snapshot September 24, 2025 06:16 Abandoned
@github-actions github-actions Bot requested a deployment to snapshot September 29, 2025 00:42 Abandoned
@github-actions github-actions Bot requested a deployment to snapshot September 29, 2025 07:53 Abandoned
@github-actions github-actions Bot requested a deployment to snapshot October 1, 2025 04:38 Abandoned
@seanbudd seanbudd marked this pull request as ready for review October 3, 2025 02:09
@seanbudd seanbudd requested review from a team as code owners October 3, 2025 02:09
@seanbudd seanbudd merged commit 982db54 into master Oct 3, 2025
55 checks passed
@seanbudd seanbudd deleted the try-64bit branch October 3, 2025 02:09
@github-actions github-actions Bot added this to the 2026.1 milestone Oct 3, 2025

Copilot AI left a comment

Contributor

Pull Request Overview

This PR migrates NVDA builds to 64-bit only and adds a new on-device AI Image Descriptions (local captioner) feature, with documentation, configuration, tests, and CI updates.

  • Switch build and CI to 64-bit only; drop x86 references.
  • Introduce local image captioning: ONNX-based captioner, model downloader, settings panel, global commands, docs, and comprehensive unit/system tests.

Reviewed Changes

Copilot reviewed 27 out of 30 changed files in this pull request and generated 15 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `user_docs/en/userGuide.md` | Update OS support notes and add user docs for AI Image Descriptions and settings. |
| `user_docs/en/changes.md` | Add changelog for 2026.1 including AI Image Descriptions and 64-bit requirement. |
| `tests/unit/test_localCaptioner/test_downloader.py` | Unit tests for model downloader behavior. |
| `tests/unit/test_localCaptioner/test_captioner.py` | Unit tests for ONNX captioner pipeline and configuration parsing. |
| `tests/system/robot/automatedImageDescriptions.robot` | Robot system test for AI image descriptions. |
| `tests/system/robot/automatedImageDescriptions.py` | System test helper to render an image and trigger captioning. |
| `tests/system/nvdaSettingsFiles/standard-doLoadMockModel.ini` | Test config to enable mock model loading. |
| `tests/system/libraries/SystemTestSpy/mockModels.py` | Generate mock ONNX encoder/decoder and config/vocab for tests. |
| `tests/system/libraries/SystemTestSpy/configManager.py` | Generate mock model files into the staged NVDA profile. |
| `source/setup.py` | Packaging adjustments to include numpy for local captioning. |
| `source/gui/settingsDialogs.py` | Add AI Image Descriptions settings panel. |
| `source/gui/_localCaptioner/messageDialogs.py` | Dialogs for downloading models and handling outcomes. |
| `source/gui/__init__.py` | Hook settings panel into GUI. |
| `source/globalCommands.py` | Add gestures for captioning and opening the captioner settings. |
| `source/core.py` | Initialize/terminate the local captioner at startup/shutdown. |
| `source/config/configSpec.py` | Add automatedImageDescriptions section and defaults. |
| `source/config/__init__.py` | Include new config section in base configuration. |
| `source/_remoteClient/transport.py` | Minor docstring parameter style fix. |
| `source/_localCaptioner/modelDownloader.py` | Multi-threaded model downloader with retries and progress. |
| `source/_localCaptioner/modelConfig.py` | Dataclass-based model/preprocessor configuration parsing. |
| `source/_localCaptioner/imageDescriber.py` | Orchestration for capturing, running captioner, and messaging. |
| `source/_localCaptioner/captioner.py` | ONNX Runtime-based ViT+GPT2 captioner implementation. |
| `source/_localCaptioner/__init__.py` | Module lifecycle and instance management for captioner. |
| `source/NVDAState.py` | Add modelsDir path to user config write paths. |
| `pyproject.toml` | Add onnxruntime/numpy and bump sphinx; add onnx for system tests. |
| `.python-versions` | Remove 32-bit Python build target. |
| `.github/workflows/testAndPublish.yml` | Restrict arch matrix to x64 and add imageDescriptions system test suite. |

Comment on lines +734 to +745
try:
    # Use a short timeout to avoid blocking indefinitely
    ok, msg = future.result(timeout=1.0)
    if ok:
        successful.append(filePath)
        log.debug(f"successful {filePath=}")
    else:
        failed.append(filePath)
        log.debug(f"failed: {filePath} - {msg}")
except Exception as err:
    failed.append(filePath)
    log.debug(f"failed: {filePath} – {err}")

Copilot AI Oct 3, 2025

Using future.result(timeout=1.0) will mark in-progress downloads as failed after 1 second. This can cause spurious failures for large files or slow connections. Replace this loop with concurrent.futures.as_completed(futures) or call future.result() without a timeout to wait for completion.
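
A minimal sketch of the `as_completed` variant suggested here, not the downloader's actual code. The function name `download_all` and the `download_one` callable (assumed to return an `(ok, msg)` tuple, as in the excerpt above) are hypothetical.

```python
import concurrent.futures
from typing import Callable, Iterable, List, Tuple


def download_all(
    file_paths: Iterable[str],
    download_one: Callable[[str], Tuple[bool, str]],
    max_workers: int = 4,
) -> Tuple[List[str], List[str]]:
    """Run download_one for every path and sort the paths into successes and failures."""
    successful: List[str] = []
    failed: List[str] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its path so results can be attributed.
        futures = {executor.submit(download_one, path): path for path in file_paths}
        # as_completed yields each future only once it has finished,
        # so slow downloads are waited for instead of being marked failed.
        for future in concurrent.futures.as_completed(futures):
            path = futures[future]
            try:
                ok, _msg = future.result()
            except Exception:
                failed.append(path)
            else:
                (successful if ok else failed).append(path)
    return successful, failed
```

Results arrive in completion order rather than submission order, so callers that need a stable order should sort the returned lists.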

Comment on lines +100 to +115
def ensureModelsDirectory(self) -> str:
    """
    Ensure the *models* directory exists (``../../models`` relative to *basePath*).

    :return: Absolute path of the *models* directory.
    :raises OSError: When the directory cannot be created.
    """
    modelsDir = os.path.abspath(config.conf["automatedImageDescriptions"]["defaultModel"])

    try:
        Path(modelsDir).mkdir(parents=True, exist_ok=True)
    except OSError as err:
        raise OSError(f"Failed to create models directory {modelsDir}: {err}") from err
    else:
        log.debug(f"Models directory ensured: {modelsDir}")
        return modelsDir

Copilot AI Oct 3, 2025

This creates the directory relative to the current working directory and ignores the configured models root (WritePaths.modelsDir). Build the path under the user's config directory instead, e.g. modelsDir = os.path.join(WritePaths.modelsDir, config.conf['automatedImageDescriptions']['defaultModel']). Also update the docstring which refers to a removed basePath concept.
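
The path handling suggested in this comment might look roughly like the sketch below. To stay self-contained it does not import NVDA: `models_root` stands in for `WritePaths.modelsDir` and `model_name` for the `defaultModel` config value, and the function name is hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_models_directory(models_root: str, model_name: str) -> str:
    """Create <models_root>/<model_name> if needed and return its absolute path.

    :raises OSError: When the directory cannot be created.
    """
    # Anchor the model directory under the configured root,
    # not under the current working directory.
    models_dir = Path(models_root) / model_name
    try:
        models_dir.mkdir(parents=True, exist_ok=True)
    except OSError as err:
        raise OSError(f"Failed to create models directory {models_dir}: {err}") from err
    return str(models_dir.resolve())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(ensure_models_directory(root, "Xenova/vit-gpt2-image-captioning"))
```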

Comment on lines +40 to +52
obj = api.getNavigatorObject()

# Get the object's position and size information
x, y, width, height = obj.location

# Create a bitmap with the same size as the object
bmp = wx.Bitmap(width, height)

# Create a memory device context for drawing operations on the bitmap
mem = wx.MemoryDC(bmp)

# Copy the specified screen region to the memory bitmap
mem.Blit(0, 0, width, height, wx.ScreenDC(), x, y)

Copilot AI Oct 3, 2025

Some navigator objects do not expose a location; attempting to unpack obj.location will raise. Wrap this in a try/except (e.g., AttributeError/TypeError/NotImplementedError) and report a user-friendly message (e.g., 'Object has no location') instead of raising.
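
One way to apply this guard is sketched below. It avoids importing NVDA or wx; `get_capture_rect` is a hypothetical helper, and the extra `ValueError` handling and the non-positive size check are assumptions added here, not something the comment asks for.

```python
from types import SimpleNamespace
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]


def get_capture_rect(obj) -> Optional[Rect]:
    """Return (x, y, width, height) for obj, or None if it cannot be captured."""
    try:
        x, y, width, height = obj.location
    except (AttributeError, TypeError, NotImplementedError, ValueError):
        # No location attribute, a None location, an unimplemented
        # property, or a tuple of the wrong length (assumption).
        return None
    if width <= 0 or height <= 0:
        # Assumption: an empty rectangle cannot be rendered into a bitmap.
        return None
    return x, y, width, height


if __name__ == "__main__":
    print(get_capture_rect(SimpleNamespace(location=(10, 20, 300, 200))))
    print(get_capture_rect(SimpleNamespace(location=None)))
```

A caller can then report a user-facing message when `None` is returned instead of letting the exception propagate.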

Comment on lines +78 to +81
    ui.message(pgettext("imageDesc", "Failed to generate description"))
    log.exception("Failed to generate caption")
else:
    ui.message(description)

Copilot AI Oct 3, 2025

ui.message is called from a background thread (captionThread). UI updates should be marshaled onto the GUI thread. Use wx.CallAfter(ui.message, ...) in both the exception and success paths.

Suggested change
-     ui.message(pgettext("imageDesc", "Failed to generate description"))
-     log.exception("Failed to generate caption")
- else:
-     ui.message(description)
+     wx.CallAfter(ui.message, pgettext("imageDesc", "Failed to generate description"))
+     log.exception("Failed to generate caption")
+ else:
+     wx.CallAfter(ui.message, description)
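
For readers unfamiliar with the pattern: `wx.CallAfter` queues a callable to run later on the GUI thread. The sketch below illustrates the same marshaling idea with the standard library only, so it runs without wx. The names `call_after`, `caption_worker` and `drain_pending_calls` are hypothetical stand-ins, and the queue plays the role of the GUI event loop.

```python
import queue
import threading
from typing import Callable

# Stand-in for the GUI event queue that wx.CallAfter posts to.
_pending_calls: "queue.Queue" = queue.Queue()


def call_after(func: Callable, *args) -> None:
    """Queue func(*args) to be run later by the owning (main) thread."""
    _pending_calls.put((func, args))


def caption_worker(generate: Callable[[], str], announce: Callable[[str], None]) -> None:
    """Background-thread body: never calls announce directly."""
    try:
        description = generate()
    except Exception:
        call_after(announce, "Failed to generate description")
    else:
        call_after(announce, description)


def drain_pending_calls() -> None:
    """Run all queued calls on the calling thread (the event loop's job in wx)."""
    while True:
        try:
            func, args = _pending_calls.get_nowait()
        except queue.Empty:
            return
        func(*args)
```

The worker thread only enqueues work. The announcement itself runs on whichever thread drains the queue, which in a wx application is the GUI thread.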

Comment thread source/setup.py
Comment on lines +257 to +258
"numpy._core._exceptions",
"numpy._core._multiarray_umath",

Copilot AI Oct 3, 2025

These module paths are incorrect for NumPy 2.x. Use 'numpy.core._exceptions' and 'numpy.core._multiarray_umath' (without the leading underscore package). Incorrect includes will cause import errors in the frozen build.

Suggested change
- "numpy._core._exceptions",
- "numpy._core._multiarray_umath",
+ "numpy.core._exceptions",
+ "numpy.core._multiarray_umath",

Comment on lines +13 to +14
Test Setup    start NVDA    standard-doLoadMockModel.ini
Test Teardown    default teardown

Copilot AI Oct 3, 2025

NVDA is started twice: once in Test Setup and again in the test's [Setup]. Remove one of these to avoid double startup interference.

*** Test Cases ***
automatedImageDescriptions
    [Documentation]    Ensure that local captioner work
    [Setup]    start NVDA    standard-doLoadMockModel.ini

Copilot AI Oct 3, 2025

NVDA is started twice: once in Test Setup and again in the test's [Setup]. Remove one of these to avoid double startup interference.

Suggested change
-     [Setup]    start NVDA    standard-doLoadMockModel.ini

*** Keywords ***
default teardown
    ${screenshotName}=    create_preserved_test_output_filename    failedTest.png
    Run Keyword If Test Failed    Take Screenshot    ${screenShotName}

Copilot AI Oct 3, 2025

Variable name mismatch: you set ${screenshotName} but use ${screenShotName}. Use the same variable name in both lines to ensure screenshots are captured on failure.

Suggested change
-     Run Keyword If Test Failed    Take Screenshot    ${screenShotName}
+     Run Keyword If Test Failed    Take Screenshot    ${screenshotName}

Comment thread user_docs/en/changes.md
* Press `NVDA+Windows+,` to get an AI generated image description. (#18475, @tianzeshi-study)
* This is generated locally on the device - no information is sent to the internet.
* A new unassigned command is available for quickly opening the settings dialog for local image description. (#18475)
* Another new unassigned command is available for toggle image captioning. (#18475)

Copilot AI Oct 3, 2025

Correct grammar: 'Another new unassigned command is available to toggle image captioning.'

Suggested change
- * Another new unassigned command is available for toggle image captioning. (#18475)
+ * Another new unassigned command is available to toggle image captioning. (#18475)

Comment on lines +599 to +607
if actualSize == 0:
return False, "Downloaded file is empty"

if total > 0 and actualSize != total:
return False, f"File incomplete: {actualSize}/{total} bytes downloaded"

# Final progress callback
if progressCallback and not self.cancelRequested:
progressCallback(fileName, actualSize, max(total, actualSize), 100.0)

Copilot AI Oct 3, 2025

Only file size is verified. Consider adding optional checksum verification (e.g., SHA-256) against a known manifest to protect against corruption/tampering when downloading model files.
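
The optional checksum step proposed here could be sketched as below. This is not part of the downloader: the helper names are hypothetical, and where the expected digest would come from (for example a manifest shipped with NVDA) is an open assumption.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so large model files are never fully held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> Tuple[bool, str]:
    """Return (ok, message), matching the (bool, str) shape used in the excerpt above."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.strip().lower():
        return False, f"Checksum mismatch: expected {expected_sha256}, got {actual}"
    return True, "OK"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        model = Path(folder) / "model.onnx"
        model.write_bytes(b"example bytes")
        print(verify_download(model, sha256_of_file(model)))
```

A size check only catches truncation, while a digest comparison also catches corrupted or substituted content of the same length.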

@seanbudd

seanbudd commented Oct 3, 2025

Member Author

@tianzeshi-study congratulations! Your work is now available on NVDA alphas.

Would you also mind looking into a PR to master to resolve some of Copilot's comments above?

@tianzeshi-study

Contributor

> @tianzeshi-study congratulations! Your work is now available on NVDA alphas.
>
> Would you also mind looking into a PR to master to resolve some of Copilot's comments above?

Ok, my pleasure.

@tianzeshi-study tianzeshi-study mentioned this pull request Oct 3, 2025
5 tasks
seanbudd pushed a commit that referenced this pull request Oct 6, 2025
Follow up #18924
Fixes #19033
Fixes #19039
Summary of the issue:

Enabling AI Image Descriptions causes NVDA to start somewhat more slowly.

The image description is not shown in braille.
Description of user facing changes:

Somewhat improved NVDA startup speed when AI image descriptions are enabled.

Show the image description in braille.

Somewhat reduced NVDA memory usage.
Description of developer facing changes:

Improve grammar and variable names.

Show the image description message in the main thread.

Load the image describer in the background.

Labels

conceptApproved Similar 'triaged' for issues, PR accepted in theory, implementation needs review.


6 participants