Only build 64bit NVDA by seanbudd · Pull Request #18924 · nvaccess/nvda

seanbudd · 2025-09-15T00:57:33Z

Link to issue number:

Part of #16304

Summary of the issue:

We are migrating to 64bit NVDA in 2026.1

Description of user facing changes:

Switch alpha builds to 64bit

Description of developer facing changes:

Description of development approach:

Testing strategy:

Known issues with pull request:

Blocked by:

A stable installer

Code Review Checklist:

Documentation:
- Change log entry
- User Documentation
- Developer / Technical Documentation
- Context sensitive help for GUI changes
Testing:
- Unit tests
- System (end to end) tests
- Manual testing
UX of all users considered:
- Speech
- Braille
- Low Vision
- Different web browsers
- Localization in other languages / culture than English
API is compatible with existing add-ons.
Security precautions taken.

Resolves #16281

Summary of the issue:

NVDA currently lacks a built-in, offline image captioning feature. Existing solutions require a reliable internet connection, raising privacy concerns, potential costs, and latency, and many NVDA users (especially in developing regions or on older hardware) have limited connectivity or constrained resources. There is no robust, integrated offline alternative.

Description of user facing changes:

Introduces device-side image description directly within NVDA, requiring no cloud service.
Adds three global commands (with default shortcuts):
- `NVDA+Windows+,`: Generate a caption for the current image under focus.
- `NVDA+Windows+Shift+,`: Release the loaded model and free memory.
- `NVDA+Windows+Ctrl+,`: Open the Model Manager GUI to download or manage models.
Extends NVDA's settings panel to enable/disable offline captioning and configure model paths.

Description of developer facing changes:

New _localCaptioner module containing:
- captioner.py: Core inference engine exposing generate_caption(image) for producing text descriptions.
- panel.py: NVDA settings integration (lazy or on-startup model loading, custom path).
- modelDownloader.py: CLI tool to download ONNX models.
- modelManager.py: GUI for selecting download paths and managing available models.
Uses the Hugging Face Xenova/vit-gpt2-image-captioning model in ONNX format (via onnxruntime) to balance accuracy, speed, and low resource usage.
Modular design allows for future extension to additional models or formats.

Description of development approach:

- Modular integration: Keeps _localCaptioner self-contained and compatible with NVDA's plugin architecture.
- Lightweight inference: Leverages ONNXRuntime for fast, local inference without heavy PyTorch or TensorFlow dependencies.
- Lazy loading: Model is only loaded when first invoked (or at startup, if configured), minimizing initial memory footprint.
- Dual interfaces: Provides both CLI scripts (captioner.py, modelDownloader.py) for quick tests and a GUI (modelManager.py) for end-users.
- Extensible architecture: Configuration files (e.g., config.json) conform to Hugging Face format for easy swapping of models.
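
The lazy loading and "release the loaded model" behaviour described above can be illustrated with a minimal sketch. This is not the code from the `_localCaptioner` module; the `LazyModel` class, its `loadOnStartup` flag, and the `loader` callable are hypothetical names chosen for illustration only.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyModel(Generic[T]):
    """Hypothetical holder that defers loading a model until first use."""

    def __init__(self, loader: Callable[[], T], loadOnStartup: bool = False) -> None:
        self._loader = loader
        self._model: Optional[T] = None
        if loadOnStartup:
            # Optional eager path: pay the loading cost up front.
            self.get()

    @property
    def isLoaded(self) -> bool:
        return self._model is not None

    def get(self) -> T:
        # Load on first access only; later calls reuse the cached instance.
        if self._model is None:
            self._model = self._loader()
        return self._model

    def release(self) -> None:
        # Drop the reference so the memory can be reclaimed.
        self._model = None
```

With this shape, a "generate caption" command would call `get()` and a "release model" command would call `release()`; the next caption request then reloads the model.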

…18934) Summary of the issue: The button shown after a successful download should read 'OK' rather than 'Yes'. Description of user facing changes: After AI image descriptions are downloaded successfully, the confirmation dialog shows an "OK" button instead of a "Yes" button.

…8945) Summary of the issue: Fixed an issue where image descriptions would download successfully but would not be loaded and enabled automatically. Description of user facing changes: Image descriptions are now loaded automatically after a successful download.

CyrilleB79 · 2025-09-19T07:50:03Z

Hi,

When I try to create a portable copy from the latest snapshot of this branch, the process fails with:

- A dialog reporting the error (which, by the way, is not very user-friendly)
- The following error in the log:

ERROR - gui.installerGui.doCreatePortable (09:38:30.324) - MainThread (4036):
Failed to create portable copy
Traceback (most recent call last):
  File "gui\installerGui.pyc", line 638, in doCreatePortable
  File "systemUtils.pyc", line 237, in __init__
ctypes.ArgumentError: argument 2: TypeError: expected LP_c_void_p instance instead of pointer to c_long

Running the temp copy from the installer is OK.

CyrilleB79 · 2025-09-19T07:53:29Z

I just realized that this is fixed in #18927, which has been merged into master.
Could the latest master branch be merged into the try-64bit branch to pick up the fix?

seanbudd · 2025-09-21T23:04:10Z

@CyrilleB79 - done

CyrilleB79 · 2025-09-22T07:18:03Z

Thanks @seanbudd for the new build.

Please find below issues found while testing nvda_snapshot_try-64bit-52706,70587e82:

The "Reports the text on the Windows clipboard" command (`NVDA+c`)

When I copy text to the clipboard and press NVDA+c, I get the message "There is no text on the clipboard".

Copy from virtual buffer does not work

In Chrome, I select something in the virtual buffer and press control+c. The text is not copied and the following error is logged:

IO - inputCore.InputManager.executeGesture (09:10:25.761) - winInputHook (21604):
Input: kb(desktop):control+c
ERROR - scriptHandler.executeScript (09:10:25.762) - MainThread (5840):
error executing script: <bound method CursorManager.script_copyToClipboard of <NVDAObjects.IAccessible.chromium.ChromeVBuf object at 0x0000014793C95810>> with gesture 'contrôle+c'
Traceback (most recent call last):
  File "scriptHandler.pyc", line 300, in executeScript
  File "cursorManager.pyc", line 543, in script_copyToClipboard
  File "textInfos\__init__.pyc", line 612, in copyToClipboard
  File "api.pyc", line 411, in copyToClip
  File "winUser.pyc", line 890, in setClipboardData
  File "contextlib.pyc", line 148, in __exit__
  File "winKernel.pyc", line 549, in lock
ctypes.ArgumentError: argument 1: TypeError: 'HGLOBAL' object cannot be interpreted as an integer

seanbudd · 2025-09-22T07:53:22Z

Hi @CyrilleB79 - please report these as proper issues. This PR is mainly for testing the image description work and any other code that can only go into 64bit NVDA.

…ort (#19008)

Copilot

Pull Request Overview

This PR migrates NVDA builds to 64-bit only and adds a new on-device AI Image Descriptions (local captioner) feature, with documentation, configuration, tests, and CI updates.

- Switch build and CI to 64-bit only; drop x86 references.
- Introduce local image captioning: ONNX-based captioner, model downloader, settings panel, global commands, docs, and comprehensive unit/system tests.

Reviewed Changes

Copilot reviewed 27 out of 30 changed files in this pull request and generated 15 comments.

File	Description
user_docs/en/userGuide.md	Update OS support notes and add user docs for AI Image Descriptions and settings.
user_docs/en/changes.md	Add changelog for 2026.1 including AI Image Descriptions and 64-bit requirement.
tests/unit/test_localCaptioner/test_downloader.py	Unit tests for model downloader behavior.
tests/unit/test_localCaptioner/test_captioner.py	Unit tests for ONNX captioner pipeline and configuration parsing.
tests/system/robot/automatedImageDescriptions.robot	Robot system test for AI image descriptions.
tests/system/robot/automatedImageDescriptions.py	System test helper to render an image and trigger captioning.
tests/system/nvdaSettingsFiles/standard-doLoadMockModel.ini	Test config to enable mock model loading.
tests/system/libraries/SystemTestSpy/mockModels.py	Generate mock ONNX encoder/decoder and config/vocab for tests.
tests/system/libraries/SystemTestSpy/configManager.py	Generate mock model files into the staged NVDA profile.
source/setup.py	Packaging adjustments to include numpy for local captioning.
source/gui/settingsDialogs.py	Add AI Image Descriptions settings panel.
source/gui/_localCaptioner/messageDialogs.py	Dialogs for downloading models and handling outcomes.
source/gui/__init__.py	Hook settings panel into GUI.
source/globalCommands.py	Add gestures for captioning and opening the captioner settings.
source/core.py	Initialize/terminate the local captioner at startup/shutdown.
source/config/configSpec.py	Add automatedImageDescriptions section and defaults.
source/config/__init__.py	Include new config section in base configuration.
source/_remoteClient/transport.py	Minor docstring parameter style fix.
source/_localCaptioner/modelDownloader.py	Multi-threaded model downloader with retries and progress.
source/_localCaptioner/modelConfig.py	Dataclass-based model/preprocessor configuration parsing.
source/_localCaptioner/imageDescriber.py	Orchestration for capturing, running captioner, and messaging.
source/_localCaptioner/captioner.py	ONNX Runtime-based ViT+GPT2 captioner implementation.
source/_localCaptioner/__init__.py	Module lifecycle and instance management for captioner.
source/NVDAState.py	Add modelsDir path to user config write paths.
pyproject.toml	Add onnxruntime/numpy and bump sphinx; add onnx for system tests.
.python-versions	Remove 32-bit Python build target.
.github/workflows/testAndPublish.yml	Restrict arch matrix to x64 and add imageDescriptions system test suite.

Copilot · 2025-10-03T02:15:00Z

+				try:
+					# Use a short timeout to avoid blocking indefinitely
+					ok, msg = future.result(timeout=1.0)
+					if ok:
+						successful.append(filePath)
+						log.debug(f"successful {filePath=}")
+					else:
+						failed.append(filePath)
+						log.debug(f"failed: {filePath} - {msg}")
+				except Exception as err:
+					failed.append(filePath)
+					log.debug(f"failed: {filePath} – {err}")


Using future.result(timeout=1.0) will mark in-progress downloads as failed after 1 second. This can cause spurious failures for large files or slow connections. Replace this loop with concurrent.futures.as_completed(futures) or call future.result() without a timeout to wait for completion.
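
A minimal sketch of the `as_completed` approach suggested above. The `downloadOne` function is a hypothetical stand-in that sleeps instead of downloading; the real downloader's names and signatures are not shown in this thread and are not assumed here.

```python
import concurrent.futures
import time


def downloadOne(filePath: str, delay: float) -> tuple[bool, str]:
    """Hypothetical stand-in for one download; sleeps to simulate network time."""
    time.sleep(delay)
    if filePath.endswith(".bad"):
        return False, "simulated failure"
    return True, "ok"


def downloadAll(jobs: dict[str, float]) -> tuple[list[str], list[str]]:
    """Run all jobs and sort them into (successful, failed) once each one finishes."""
    successful: list[str] = []
    failed: list[str] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        futures = {
            executor.submit(downloadOne, path, delay): path
            for path, delay in jobs.items()
        }
        # as_completed yields each future only when it is done, so result()
        # returns immediately and slow downloads are never marked as failed.
        for future in concurrent.futures.as_completed(futures):
            filePath = futures[future]
            try:
                ok, _msg = future.result()
            except Exception:
                failed.append(filePath)
                continue
            (successful if ok else failed).append(filePath)
    return successful, failed
```

Cancellation would still need separate handling (for example, checking a flag inside the worker), since `as_completed` without a timeout waits for every submitted job.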

Copilot · 2025-10-03T02:15:00Z

+	def ensureModelsDirectory(self) -> str:
+		"""
+		Ensure the *models* directory exists (``../../models`` relative to *basePath*).
+
+		:return: Absolute path of the *models* directory.
+		:raises OSError: When the directory cannot be created.
+		"""
+		modelsDir = os.path.abspath(config.conf["automatedImageDescriptions"]["defaultModel"])
+
+		try:
+			Path(modelsDir).mkdir(parents=True, exist_ok=True)
+		except OSError as err:
+			raise OSError(f"Failed to create models directory {modelsDir}: {err}") from err
+		else:
+			log.debug(f"Models directory ensured: {modelsDir}")
+			return modelsDir


This creates the directory relative to the current working directory and ignores the configured models root (WritePaths.modelsDir). Build the path under the user's config directory instead, e.g. modelsDir = os.path.join(WritePaths.modelsDir, config.conf['automatedImageDescriptions']['defaultModel']). Also update the docstring which refers to a removed basePath concept.
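
The path fix described above amounts to joining the model name onto an explicit root rather than resolving it against the current working directory. A minimal sketch, in which `modelsRoot` stands in for `WritePaths.modelsDir` and `modelName` for the configured `defaultModel` value (both passed as plain parameters so the example runs without NVDA):

```python
import os
from pathlib import Path


def ensureModelDirectory(modelsRoot: str, modelName: str) -> str:
    """
    Ensure the directory for *modelName* exists under *modelsRoot*.

    :return: Absolute path of the model directory.
    :raises OSError: When the directory cannot be created.
    """
    # Join onto the configured root so the result never depends on the
    # process's current working directory.
    modelDir = os.path.abspath(os.path.join(modelsRoot, modelName))
    try:
        Path(modelDir).mkdir(parents=True, exist_ok=True)
    except OSError as err:
        raise OSError(f"Failed to create models directory {modelDir}: {err}") from err
    return modelDir
```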

Copilot · 2025-10-03T02:15:00Z

+	obj = api.getNavigatorObject()
+
+	# Get the object's position and size information
+	x, y, width, height = obj.location
+
+	# Create a bitmap with the same size as the object
+	bmp = wx.Bitmap(width, height)
+
+	# Create a memory device context for drawing operations on the bitmap
+	mem = wx.MemoryDC(bmp)
+
+	# Copy the specified screen region to the memory bitmap
+	mem.Blit(0, 0, width, height, wx.ScreenDC(), x, y)


Some navigator objects do not expose a location; attempting to unpack obj.location will raise. Wrap this in a try/except (e.g., AttributeError/TypeError/NotImplementedError) and report a user-friendly message (e.g., 'Object has no location') instead of raising.
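
One possible shape for the guard suggested above, sketched without NVDA or wx. The `getCaptureRegion` helper is hypothetical; the caller would report a user-friendly message when it returns `None` instead of proceeding to create the bitmap.

```python
from types import SimpleNamespace
from typing import Any, Optional


def getCaptureRegion(obj: Any) -> Optional[tuple[int, int, int, int]]:
    """Return (x, y, width, height) for *obj*, or None if it has no usable location."""
    try:
        x, y, width, height = obj.location
    except (AttributeError, TypeError, ValueError, NotImplementedError):
        # No location attribute, location is None, or it has the wrong shape.
        return None
    if width <= 0 or height <= 0:
        # An empty region cannot be captured into a bitmap.
        return None
    return x, y, width, height


# Usage with stand-in objects (real code would pass the navigator object):
region = getCaptureRegion(SimpleNamespace(location=(10, 20, 300, 200)))
missing = getCaptureRegion(SimpleNamespace(location=None))
```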

Copilot · 2025-10-03T02:15:01Z

+		ui.message(pgettext("imageDesc", "Failed to generate description"))
+		log.exception("Failed to generate caption")
+	else:
+		ui.message(description)


ui.message is called from a background thread (captionThread). UI updates should be marshaled onto the GUI thread. Use wx.CallAfter(ui.message, ...) in both the exception and success paths.

Suggested change

-		ui.message(pgettext("imageDesc", "Failed to generate description"))
-		log.exception("Failed to generate caption")
-	else:
-		ui.message(description)
+		wx.CallAfter(ui.message, pgettext("imageDesc", "Failed to generate description"))
+		log.exception("Failed to generate caption")
+	else:
+		wx.CallAfter(ui.message, description)
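
The idea behind this suggestion is that a worker thread never calls the reporting function directly; it only queues the call, and the main thread executes it later. The sketch below imitates that pattern with the standard library so it can run without wx. `callAfter` and `drainMainThreadCalls` are hypothetical stand-ins for what `wx.CallAfter` and the wx event loop do, not part of NVDA or wxPython.

```python
import queue
import threading
from typing import Any, Callable

# Calls queued by worker threads, to be executed later on the main thread.
mainThreadCalls: "queue.Queue[tuple[Callable[..., Any], tuple[Any, ...]]]" = queue.Queue()


def callAfter(func: Callable[..., Any], *args: Any) -> None:
    """Stand-in for wx.CallAfter: schedule func(*args) for the main thread."""
    mainThreadCalls.put((func, args))


def captionWorker(describe: Callable[[], str], report: Callable[[str], None]) -> None:
    """Runs on a background thread; never calls report() directly."""
    try:
        description = describe()
    except Exception:
        callAfter(report, "Failed to generate description")
    else:
        callAfter(report, description)


def drainMainThreadCalls() -> None:
    """Stand-in for the GUI event loop: run every queued call on this thread."""
    while True:
        try:
            func, args = mainThreadCalls.get_nowait()
        except queue.Empty:
            break
        func(*args)
```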

Copilot · 2025-10-03T02:15:01Z

+			"numpy._core._exceptions",
+			"numpy._core._multiarray_umath",


These module paths are incorrect for NumPy 2.x. Use 'numpy.core._exceptions' and 'numpy.core._multiarray_umath' (without the leading underscore package). Incorrect includes will cause import errors in the frozen build.

Suggested change

-			"numpy._core._exceptions",
-			"numpy._core._multiarray_umath",
+			"numpy.core._exceptions",
+			"numpy.core._multiarray_umath",

Copilot · 2025-10-03T02:15:03Z

+Test Setup	start NVDA	standard-doLoadMockModel.ini
+Test Teardown	default teardown


NVDA is started twice: once in Test Setup and again in the test's [Setup]. Remove one of these to avoid double startup interference.

Copilot · 2025-10-03T02:15:03Z

+*** Test Cases ***
+automatedImageDescriptions
+	[Documentation]	Ensure that local captioner work
+	[Setup]	start NVDA	standard-doLoadMockModel.ini


NVDA is started twice: once in Test Setup and again in the test's [Setup]. Remove one of these to avoid double startup interference.

Suggested change

-	[Setup]	start NVDA	standard-doLoadMockModel.ini

Copilot · 2025-10-03T02:15:03Z

+*** Keywords ***
+default teardown
+	${screenshotName}=	create_preserved_test_output_filename	failedTest.png
+	Run Keyword If Test Failed	Take Screenshot	${screenShotName}


Variable name mismatch: you set ${screenshotName} but use ${screenShotName}. Use the same variable name in both lines to ensure screenshots are captured on failure.

Suggested change

-	Run Keyword If Test Failed	Take Screenshot	${screenShotName}
+	Run Keyword If Test Failed	Take Screenshot	${screenshotName}

Copilot · 2025-10-03T02:15:03Z

+  * Press `NVDA+Windows+,` to get an AI generated image description. (#18475, @tianzeshi-study)
+  * This is generated locally on the device - no information is sent to the internet.
+  * A new unassigned command is available for quickly opening the settings dialog for local image description. (#18475)
+  * Another new unassigned command is available for toggle image captioning. (#18475)


Correct grammar: 'Another new unassigned command is available to toggle image captioning.'

Suggested change

-  * Another new unassigned command is available for toggle image captioning. (#18475)
+  * Another new unassigned command is available to toggle image captioning. (#18475)

Copilot · 2025-10-03T02:15:04Z

+		if actualSize == 0:
+			return False, "Downloaded file is empty"
+
+		if total > 0 and actualSize != total:
+			return False, f"File incomplete: {actualSize}/{total} bytes downloaded"
+
+		# Final progress callback
+		if progressCallback and not self.cancelRequested:
+			progressCallback(fileName, actualSize, max(total, actualSize), 100.0)


Only file size is verified. Consider adding optional checksum verification (e.g., SHA-256) against a known manifest to protect against corruption/tampering when downloading model files.
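
A minimal sketch of the optional checksum step suggested above, using only `hashlib`. The `verifyDownload` helper and the idea of an expected digest supplied by a manifest are assumptions for illustration; the thread does not say where a trusted digest would come from.

```python
import hashlib


def sha256OfFile(path: str, chunkSize: int = 1024 * 1024) -> str:
    """Hash the file in chunks so large model files are never read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunkSize), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verifyDownload(path: str, expectedSha256: str) -> tuple[bool, str]:
    """Return (ok, message), mirroring the (bool, str) results used by the size checks."""
    actual = sha256OfFile(path)
    if actual != expectedSha256.strip().lower():
        return False, f"Checksum mismatch: expected {expectedSha256}, got {actual}"
    return True, "ok"
```

This would run after the size checks pass, so an incomplete file is still rejected cheaply before any hashing happens.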

seanbudd · 2025-10-03T02:17:08Z

@tianzeshi-study congratulations! Your work is now available on NVDA alphas.

Would you also mind looking into a PR to master to resolve some of Copilot's comments above?

tianzeshi-study · 2025-10-03T04:16:44Z

> @tianzeshi-study congratulations! Your work is now available on NVDA alphas.
>
> Would you also mind looking into a PR to master to resolve some of Copilot's comments above?

Ok, my pleasure.

Follow up #18924
Fixes #19033
Fixes #19039

Summary of the issue:

- Enabling AI Image Descriptions makes NVDA start somewhat more slowly.
- The image description is not shown in braille.

Description of user facing changes:

- Somewhat faster NVDA startup when AI image descriptions are enabled.
- The image description is shown in braille.
- Somewhat reduced NVDA memory usage.

Description of developer facing changes:

- Improve grammar and variable names.
- Show the image description message on the main thread.
- Load the image describer in the background.

seanbudd force-pushed the try-64bit branch from 2fdc220 to 58dd147 Compare September 15, 2025 01:00

seanbudd added 2 commits September 15, 2025 11:00

Only build 64bit

58dd147

build PRs against try branches

6eb2612

seanbudd temporarily deployed to snapshot September 15, 2025 01:51 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 15, 2025 01:51 Abandoned

tianzeshi-study added 2 commits September 15, 2025 16:22

seanbudd temporarily deployed to snapshot September 17, 2025 00:04 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 17, 2025 00:04 Abandoned

SaschaCowley added the conceptApproved Similar 'triaged' for issues, PR accepted in theory, implementation needs review. label Sep 18, 2025

seanbudd temporarily deployed to snapshot September 19, 2025 04:16 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 19, 2025 04:17 Abandoned

Merge remote-tracking branch 'origin/master' into try-64bit

cf04a74

seanbudd temporarily deployed to snapshot September 22, 2025 00:00 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 22, 2025 00:01 Abandoned

Merge remote-tracking branch 'origin/master' into try-64bit

70587e8

seanbudd temporarily deployed to snapshot September 22, 2025 02:54 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 22, 2025 02:54 Abandoned

Merge remote-tracking branch 'origin/master' into try-64bit

20fc6ee

seanbudd temporarily deployed to snapshot September 24, 2025 05:02 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 24, 2025 05:02 Abandoned

Merge branch 'master' into try-64bit

69a5b89

michaelDCurran temporarily deployed to snapshot September 24, 2025 06:16 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 24, 2025 06:16 Abandoned

Merge remote-tracking branch 'origin/master' into try-64bit

b293dc2

seanbudd temporarily deployed to snapshot September 29, 2025 00:41 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 29, 2025 00:42 Abandoned

Merge remote-tracking branch 'origin/master' into try-64bit

556e94d

seanbudd temporarily deployed to snapshot September 29, 2025 07:53 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot September 29, 2025 07:53 Abandoned

Update change log and user guide to reflect 64bit ARM Windows 10 supp…

b8ad0a6

…ort (#19008)

seanbudd temporarily deployed to snapshot October 1, 2025 04:37 — with GitHub Actions Inactive

github-actions Bot requested a deployment to snapshot October 1, 2025 04:38 Abandoned

seanbudd marked this pull request as ready for review October 3, 2025 02:09

seanbudd requested review from a team as code owners October 3, 2025 02:09

seanbudd requested review from Qchristensen, SaschaCowley and Copilot October 3, 2025 02:09

seanbudd merged commit 982db54 into master Oct 3, 2025
55 checks passed

seanbudd deleted the try-64bit branch October 3, 2025 02:09

github-actions Bot added this to the 2026.1 milestone Oct 3, 2025

Copilot AI reviewed Oct 3, 2025

tianzeshi-study mentioned this pull request Oct 3, 2025

Improve image captioner #19024

Merged

5 tasks


6 participants
