feat: Use NVIDIA NIM ASR for audio transcription #53
Conversation
|
I decided not to add this because it adds an additional dependency and a server outside of uv, which is a hassle for users.
This is an optional dependency. I need to update the .toml so that it is not in voice but in something like voice_nim; it is not a new required dependency.
|
OK, sounds good, as long as it works with uv sync --voice and doesn't require launching an extra server. You can do it.
Isn't it better to use something like |
|
Yes, that's better |
|
Fixed in b3d815c by using newer versions of grpcio than nvidia-riva-client initially offered.
|
Is it ready to be merged? |
Not yet, I still need to fix something in transcription. I will mark it as ready when I'm ready :)
|
@Alishahryar1 It's ready for review 😉 |
|
Some CI checks are failing; you can run those locally as well. It's best practice to run all checks locally before pushing.
… throw a name error
|
LGTM. Great work! |
## Summary

Added NVIDIA NIM as a second transcription option (alongside local Whisper). This lets you transcribe voice notes using NVIDIA's cloud API instead of running Whisper locally.

## What changed

- **Transcription**: Now supports two backends
  - Local Whisper: Free, runs on your GPU/CPU (existing)
  - NVIDIA NIM: Cloud API via Riva gRPC (new)
- **Supported models**: 8 NVIDIA NIM models added (Parakeet variants for different languages, Whisper Large V3)

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
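The two-backend setup from the summary, with NVIDIA NIM kept as an optional dependency as discussed in the conversation, could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the backend names, the `BACKEND_MODULES` mapping, the module names `whisper` and `riva`, and the functions `module_installed` and `resolve_backend` are all assumptions made for the example.

```python
from importlib.util import find_spec
from typing import Callable, Dict

# Hypothetical mapping from backend name to the top-level module it imports.
# The names here are assumptions for illustration, not taken from the PR.
BACKEND_MODULES: Dict[str, str] = {
    "whisper": "whisper",    # local backend (existing)
    "nvidia_nim": "riva",    # cloud backend via the Riva gRPC client (new)
}


def module_installed(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(name) is not None


def resolve_backend(
    requested: str,
    is_installed: Callable[[str], bool] = module_installed,
) -> str:
    """Validate the requested backend and check that its optional module exists."""
    if requested not in BACKEND_MODULES:
        known = ", ".join(sorted(BACKEND_MODULES))
        raise ValueError(f"Unknown backend {requested!r}; expected one of: {known}")
    module = BACKEND_MODULES[requested]
    if not is_installed(module):
        # Fail with a clear message instead of an ImportError deep in the call stack.
        raise RuntimeError(
            f"Backend {requested!r} needs the optional module {module!r}, "
            "which is not installed."
        )
    return requested
```

Checking for the module up front means users who only want local Whisper never need the NIM client installed, and users who select the NIM backend without its extra get an actionable error.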