|
cc : @sanchit-gandhi @dg845 |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. |
|
Hi, I pulled the (When I click the "Open pull request" button on GitHub, I'm not able to specify |
|
Hi @dg845, I just invited you to collaborate on my Since I gave you access to my repo, we can now both push changes to Please let me know if this is working or not! |
|
I have tested and added the ResNet block; I will do the attention block tomorrow, @dg845. Let's first add all the modules (those that you and I specified in Slack) and create a basic diagram until CLVP and UnivNet are merged. After that we can transfer the weights and finalize the whole model! |
…e resnet block to modeling_tortoise_tts.py.
Looks like it is working, was able to push a commit :). |
|
I have moved the resnet block code to |
…conditioning embeddings when no conditioning audio is supplied.
|
I just started on the checkpoint conversion script for the diffusion decoder model, and I will add the CLVP model conversion script later. (By the way, the weight-loading code is unfinished and a pure mess; I will update it in the next commit.) Also, it seems that you have done a lot of work here! I need to catch up. |
…ive modeling, and diffusion modeling.
|
Just a heads up that I have refactored
The code can probably be simplified further but I think this makes sense for now. |
|
The diffusion decoder attention outputs now match, and the whole decoder model will probably be ready in the next 1 or 2 days, since the ResNet outputs are already verified against the official repo. Also should we not place Please let me know what you think. |
|
I guess one difference is that when [For context, the only place |
…toregressive audio candidates into its own method.
…mming logic in tortoise-tts.
…s worse otherwise pretty good
|
Hey @susnato, I see that you are really active in integrating Tortoise! |
|
Hi @ylacombe, sure, I will let you know once it's finished, and sorry it's taking so long. |
|
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
|
Can I work on this to add Tortoise to Diffusers? I see that Tortoise TTS is very powerful, but it's very slow. |
|
Hey @tuanh123789, let's first ping @susnato to make sure he doesn't want to, or doesn't have the bandwidth to, finish this PR! @susnato, let us know! You've already made a great effort on this PR; would you like to finish it up? Thanks! |
|
Hello, I am so sorry to everyone that I couldn't finish this PR 😢. @tuanh123789, please feel free to take this up. As far as I remember, @dg845 and I have already implemented CLVP and the UnivNet vocoder, and I was able to get the logits from start to vocoder within 2e-2 atol (within the acceptable range for diffusers, as far as I am aware), so it needs a little more work to make it e2e compatible. I can also invite you to my diffusers branch so that you can continue the work from there (if you want, of course). Also, @ylacombe, maybe you could invite @tuanh123789 to our shared Slack channel (if possible) so that he can get a better idea of the current state and the issues that we were facing. |
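For anyone picking this up, here is a minimal sketch of what an "within 2e-2 atol" check means when comparing the ported model's outputs against the official repo. It is an illustration only, not the code used in this PR: the helper names `max_abs_diff` and `within_atol` and the sample values are hypothetical, and the outputs are assumed to be flattened into plain lists of floats.

```python
from typing import Sequence


def max_abs_diff(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Return the largest element-wise absolute difference between two flat sequences."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def within_atol(reference: Sequence[float], candidate: Sequence[float], atol: float = 2e-2) -> bool:
    """True when every element of `candidate` is within `atol` of `reference`."""
    return max_abs_diff(reference, candidate) <= atol


# Hypothetical values standing in for flattened model outputs.
official_output = [0.120, -0.450, 0.980]
ported_output = [0.125, -0.440, 0.970]

print("max abs diff:", max_abs_diff(official_output, ported_output))
print("within 2e-2 atol:", within_atol(official_output, ported_output))
```

With real tensors, `torch.allclose(ported, official, rtol=0, atol=2e-2)` performs the same kind of check; note that its default `rtol` is nonzero, so it should be set to 0 for a pure absolute-tolerance comparison.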
|
@tuanh123789 Let me know if you need any more pointers, I will try to answer as much as I can! |
|
Thanks for the update @susnato! @tuanh123789, feel free to reach out on X or LI so we can get you on the channel! |
Sure, please add me to your branch. |
|
Just did, @tuanh123789! You should see a message in your mail. |
|
@susnato, @tuanh123789 |
What does this PR do?
Adds the Tortoise TTS pipeline and fixes #3891
Before adding this pipeline, we need to make sure these two PRs are merged -
Before submitting
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.