Make SAPI4 voices use WASAPI #17718
See test results for failed build of commit 8fc0fd4c63
This is great work, and if we can merge it in the 2025.1 dev cycle, we won't have to announce to the community that SAPI4 will be deprecated, thus avoiding possible confusion.
seanbudd
left a comment
Thanks @gexgd0419! We recently decided to extend the life of SAPI 4 for the immediate future anyway, but this change is still much appreciated.
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
Could you also remove any mention of the deprecation of SAPI4 including the warning message?
Qchristensen
left a comment
Great work and will be appreciated by SAPI 4 users.
Currently, there's the following issue in alpha versions. It has persisted through the last few versions already, so I don't know whether it's known yet, and I'm pointing it out here.
@hozosch
Yes, but I wasn't sure at first whether that was such a good idea. I didn't want to create too many unnecessary issues for one topic. But if you wish, I'll do it.
Summary of the issue:
In #17718, I changed the SAPI4 synthesizer code so that SAPI4 can now use the WASAPI WavePlayer for audio output. This was done by creating a custom audio output destination class and letting the SAPI4 engines use it instead of the built-in `MMAudioDest` to output audio.
The engines interact with the audio destination object directly, and each engine may use it in its own way. So while the current implementation works with some SAPI4 engines, it may not work well with others.
As different engines behave differently, it may be impossible to reproduce an issue with another voice. It would be better if I could get access to the exact same voice and study its behavior, but many of those voices are commercial products that require purchase and activation, so I cannot easily obtain them.
The second option is to get the log file from the user and study the logs. While I use logs to debug on my side, they may be too chatty for regular users, because every audio chunk would be logged, so I decided to leave the logging part out. But now I find it difficult to diagnose SAPI4-related problems without logs when I can't get access to the voice.
Description of user facing changes
When the debug log category `synthDriver` in the advanced settings page is enabled, additional verbose SAPI4 logging will be enabled. It covers most of the interactions between the SAPI4 engine and the custom audio destination object, including every audio data write. This can produce many log lines when using SAPI4 voices, so it's recommended to keep the category disabled when it isn't needed.
Description of development approach
Added logging in the SAPI4 module, which can be enabled or disabled by the `synthDriver` debug log category.
Testing strategy:
Tested manually.
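The category-gated logging described above could look roughly like the following. This is only a minimal sketch using the Python standard library: `debug_categories`, `is_synth_debug_enabled` and `log_audio_write` are hypothetical names standing in for the real configuration lookup and log calls, which are not shown in this thread.

```python
import logging

log = logging.getLogger("sapi4")

# Hypothetical stand-in for the setting behind the `synthDriver`
# debug log category on the advanced settings page.
debug_categories = {"synthDriver": False}


def is_synth_debug_enabled() -> bool:
    """Return True when the verbose synth driver category is switched on."""
    return debug_categories.get("synthDriver", False)


def log_audio_write(size: int) -> bool:
    """Log one audio data write, but only when the category is enabled.

    Returns whether a log line was emitted, so callers (and tests)
    can see that the chatty per-chunk logging is skipped by default.
    """
    if not is_synth_debug_enabled():
        return False
    log.debug("SAPI4 audio write: %d bytes", size)
    return True
```

Checking the flag before formatting the message keeps the per-chunk cost negligible when the category is off.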
Link to issue number:
None
Summary of the issue:
This PR makes the built-in SAPI4 synthesizer use WASAPI to output audio, so that old code related to WinMM can be removed entirely.
Description of user facing changes
SAPI4 voices should work as usual.
Features supported by `WavePlayer`, such as audio ducking, leading silence trimming, and keeping the audio device awake (#17571), will be able to work with SAPI4 voices.
Description of development approach
Create a class to implement `IAudio` and `IAudioDest`, so that it can be used as an audio output destination to replace the SAPI4 built-in `MMAudioDest`, which uses WinMM.
SAPI4 performs audio data output on the main thread. SAPI4 expects audio data writes to be a non-blocking operation, and it should return `AUDERR_NOTENOUGHDATA` when the buffer is full. Unfortunately, that's not how `WavePlayer` works, and `WavePlayer.feed` blocks the current thread until there's enough space in the buffer. So a dedicated thread is created to feed data to `WavePlayer`, and audio data from SAPI4 will be put in a queue first to prevent blocking the thread. Bookmarks from SAPI4 will be put in the same queue.
`WavePlayer.feed` returns before the audio is finished playing, but `WavePlayer` can only check and invoke callback functions when `WavePlayer.feed` or `WavePlayer.sync` is called. If we just keep on waiting for the next audio chunk in the queue, `WavePlayer` will have no chance to call the callback functions when there is no chunk. So `WavePlayer.feed` should be called periodically, regardless of whether there's audio or a bookmark in the queue.
Testing strategy:
Further tests with different SAPI4 synthesizers are needed.
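For readers following the development approach, the queue plus feeder-thread idea can be sketched as below. This is an illustration under stated assumptions, not the code in this PR: `FakePlayer` is a hypothetical stand-in that only mimics the two `WavePlayer` behaviors described above (a blocking `feed`, and callbacks that only fire inside a later `feed` call), `QueuedAudioDest` is a made-up name, and the numeric value of `AUDERR_NOTENOUGHDATA` is a placeholder rather than the real SAPI4 constant.

```python
import queue
import threading
import time

AUDERR_NOTENOUGHDATA = 1  # placeholder value, NOT the real SAPI4 error code


class FakePlayer:
    """Hypothetical stand-in for WavePlayer, used only by the feeder thread."""

    def __init__(self):
        self.played = bytearray()
        self._pending = []  # callbacks waiting for the next feed() call

    def feed(self, data, onDone=None):
        # Callbacks registered earlier can only run inside a feed() call.
        pending, self._pending = self._pending, []
        for callback in pending:
            callback()
        if data:
            time.sleep(0.001)  # pretend to block until the buffer has room
            self.played += data
        if onDone is not None:
            self._pending.append(onDone)


class QueuedAudioDest:
    """Accepts non-blocking writes and feeds the player on its own thread."""

    def __init__(self, player, max_queued_bytes=4096, poll_interval=0.01):
        self._player = player
        self._queue = queue.Queue()
        self._queued_bytes = 0
        self._lock = threading.Lock()
        self._max = max_queued_bytes
        self._poll = poll_interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write(self, data):
        """Queue audio without blocking; report a full buffer instead."""
        with self._lock:
            if self._queued_bytes + len(data) > self._max:
                return AUDERR_NOTENOUGHDATA
            self._queued_bytes += len(data)
        self._queue.put(("audio", data))
        return 0

    def bookmark(self, callback):
        """Queue a bookmark so it stays ordered relative to the audio."""
        self._queue.put(("bookmark", callback))

    def _run(self):
        while not self._stop.is_set():
            try:
                kind, payload = self._queue.get(timeout=self._poll)
            except queue.Empty:
                # Nothing queued: still call feed() so callbacks can fire.
                self._player.feed(b"")
                continue
            if kind == "audio":
                self._player.feed(payload)
                with self._lock:
                    self._queued_bytes -= len(payload)
            else:
                self._player.feed(b"", onDone=payload)

    def close(self):
        self._stop.set()
        self._thread.join()
```

The timeout on `queue.get` is what gives the periodic `feed` call: without it, a bookmark queued after the last audio chunk would never have its callback invoked in this model.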
Known issues with pull request:
Some SAPI4 voices do not support custom audio destinations, and this is allowed by the SAPI4 spec. They choose their own way to output the audio, and therefore bypass the WASAPI WavePlayer. In fact, they also didn't use `MMAudioDest` before this PR. This means that:
Such voices have `TTSFEATURE_FIXEDAUDIO` in the feature flag.
Code Review Checklist:
@coderabbitai summary