Image recognition: LLM for image recognition delivers wrong results / hallucination #19298

@Adriani90

Description

Steps to reproduce

  1. Open NVDA.
  2. Turn on image description.
  3. Navigate to a web page, e.g. the NVDA repository on GitHub.
  4. Focus a link in browse mode.
  5. Press NVDA+G to request an image description.
  6. Switch to focus mode.
  7. Press Shift+Tab and then Tab to land on the same link.
  8. Press NVDA+G again.
  9. Repeat the NVDA+G command on different links, plain text or headings, in both browse and focus mode.
  10. Repeat NVDA+G on the desktop.
  11. Create a blank Word document and type some plain text in it (e.g. hello world).
  12. Run NVDA+G on the text, or on a blank line.

Actual behavior

The image description model hallucinates and returns wrong results, such as people surfing on a board, a photo taken from the ground of a cell phone, etc.
In MS Word or on the desktop, it starts talking about cats, people holding an ice cream, etc.

Expected behavior

NVDA image description should be applied to graphics / images only, not to links, plain text or any HTML element that doesn't contain a graphic.
If an image description doesn't make sense, NVDA should say "not an image" or something similar.
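
To illustrate the expected behavior, here is a minimal, self-contained sketch of a guard that only calls the captioning model when the target is, or contains, a graphic. It is not NVDA's actual code: `Role`, `FocusedObject`, `describe_if_image` and the `describe` callback are hypothetical names made up for this example, and how NVDA would really detect a graphic is left open.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable


class Role(Enum):
    # Hypothetical stand-in for a screen reader's control roles.
    GRAPHIC = auto()
    LINK = auto()
    HEADING = auto()
    STATIC_TEXT = auto()


@dataclass
class FocusedObject:
    # Hypothetical, simplified view of the object the command was run on.
    role: Role
    contains_graphic: bool = False


NOT_AN_IMAGE = "Not an image"


def describe_if_image(
    obj: FocusedObject,
    describe: Callable[[FocusedObject], str],
) -> str:
    """Call the captioning model only for graphics; otherwise say so."""
    if obj.role is Role.GRAPHIC or obj.contains_graphic:
        return describe(obj)
    # The model is never invoked here, so it cannot invent a caption for text.
    return NOT_AN_IMAGE
```

With a guard like this, running the command on a plain link or heading would return the fixed message, while a link that wraps an image would still be described. Whether that is the right behavior is a product decision.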

NVDA logs, crash dumps and other attachments

nvda_image-desc.txt

System configuration

NVDA installed/portable/running from source

Installed

NVDA version

alpha-53667,bc2647d0 (2026.1.0.53667)

Windows version

Windows 11 25H2

Name and version of other software in use when reproducing the issue

n/a

Other information about your system

Other questions

Does the issue still occur after restarting your computer?

yes

Have you tried any other versions of NVDA? If so, please report their behaviors

n/a

If NVDA add-ons are disabled, is your problem still occurring?

yes

Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?

yes

Metadata

Labels

blocked/needs-product-decision: A product decision needs to be made. Decisions about NVDA UX or supported use-cases.
needs-triage
