AI Image descriptions progress #19807
What do you mean by a high-quality graphics card? For example, if I bought an Ultrabook with an NPU but only integrated graphics, would I still be unable to benefit from high-quality AI?
I have already implemented multi-model support in the add-on using a model manager, and have experimentally added support for the Qwen 3.5 series ONNX models. The 0.8B model produces results of reasonably good quality in about one minute, and it supports multiple languages through preset localized prompt strings (currently Chinese and English). It is available for manual download here. Regarding hardware acceleration, I have also made some attempts, but I was not able to integrate DirectML acceleration successfully. DirectML seems to have compatibility issues with certain model operators, and resolving them turned out to be more complex than expected.
Mozilla's llamafile project could also be interesting for this.
The latest known-good NVDA installer that includes AI image descriptions can be found here.
After early alpha testing feedback, on-device AI image descriptions were removed from 2026.1.
This is due to these main reasons:
To reintroduce the feature in alpha, we want to fix the following things first:
So far we have done this by recently improving the model slightly.
The next biggest priorities are:
After:
A big technical challenge here is the lag introduced by importing numpy, which the Python onnxruntime package requires.
We investigated creating a C++ layer, but the implementation is still experimental and does not work on ARM64EC: microsoft/onnxruntime#15403
We could consider offloading onnxruntime, numpy and the describer to a separate process, similar to the 32-bit shim.
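As a rough illustration of the separate-process idea, here is a minimal sketch. It assumes a JSON-lines protocol over stdin/stdout; `DescriberProcess`, `WORKER_SOURCE` and `describe` are hypothetical names, and the model call is a placeholder. The existing 32-bit shim may well work differently. The point is only that the heavy imports would live in the worker, so the main process never pays for them.

```python
import json
import subprocess
import sys

# Source for the worker, run in a separate interpreter. In a real worker the
# heavy imports (numpy, onnxruntime) would go here, not in the main process.
WORKER_SOURCE = r'''
import json, sys

def describe(path):
    # Hypothetical placeholder: a real worker would load the model and
    # run inference on the image at `path`.
    return "description of " + path

for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "text": describe(request["path"])}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


class DescriberProcess:
    """Owns one long-lived worker process and talks to it over pipes."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self._next_id = 0

    def describe(self, path):
        # Send one request line, then block until the matching reply arrives.
        self._next_id += 1
        request = {"id": self._next_id, "path": path}
        self._proc.stdin.write(json.dumps(request) + "\n")
        self._proc.stdin.flush()
        reply = json.loads(self._proc.stdout.readline())
        if reply["id"] != request["id"]:
            raise RuntimeError("reply does not match request")
        return reply["text"]

    def close(self):
        # Closing stdin ends the worker's read loop, so it exits on its own.
        self._proc.stdin.close()
        self._proc.wait(timeout=10)
        self._proc.stdout.close()
```

In this sketch the worker would be started once and reused, so any import or model-loading cost is paid a single time and off the main process. A real version would also need timeouts and a restart path for when the worker crashes.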
Related issues and PRs: