[VITS] Fix nightly tests#25986

Merged
sanchit-gandhi merged 8 commits into huggingface:main from sanchit-gandhi:vits-tests
Sep 7, 2023
Conversation

@sanchit-gandhi
Contributor

@sanchit-gandhi sanchit-gandhi commented Sep 5, 2023

What does this PR do?

Fixes the tokenizer integration test and multi-GPU test that failed on the nightly run: https://github.com/huggingface/transformers/actions/runs/6068405445/job/16461410319

The tokenizer fix is trivial (needed to update the commit ID)!

The multi-GPU test was failing because the output sequence length for VITS is a function of the model inputs, rather than being a function of the input sequence lengths only.

Let's say we have 2 GPUs over which we want to run data parallelism (DP):

  • GPU 1 outputs a sequence length of N, which is computed based on the input in the first element of the batch x
  • GPU 2 outputs a sequence length of M, which is computed based on the input in the second element of the batch y

=> there is nothing to enforce that N = M, since the VITS output sequence length is a function of the inputs. Thus, we cannot concatenate the outputs after running the forward pass, since they have different dims.

# pseudo code for data parallelism
input_1, input_2 = torch.chunk(inputs, 2, dim=0)  # split the batch into two halves

output_1 = model(input_1)
output_2 = model(input_2)

# breaks because output_1 and output_2 have different sequence lengths
output = torch.cat([output_1, output_2], dim=0)

The fix for the test is to pass the same inputs to both GPUs, and disable the stochastic duration predictor. This way, we get consistent outputs across our GPUs.
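The failure mode and the fix can be sketched with a toy model: `ToyDurationModel` below is hypothetical, standing in only for the property that matters here, namely that the output sequence length depends on the input *values* (as with VITS's duration predictor), not just the input shape:

```python
import torch


class ToyDurationModel(torch.nn.Module):
    """Hypothetical stand-in for VITS: the output length is derived from the
    input content, mimicking an input-dependent duration predictor."""

    def forward(self, x):
        # length depends on the values in x, not on x.shape
        length = int(x.abs().sum().ceil().item()) + 1
        return torch.zeros(1, length)


model = ToyDurationModel()

# Different inputs -> different output lengths -> concatenation would fail
out_a = model(torch.tensor([0.4]))  # length 2
out_b = model(torch.tensor([2.7]))  # length 4
assert out_a.shape[-1] != out_b.shape[-1]

# The fix: feed the SAME input to every replica so the lengths match
shared = torch.tensor([1.5])
out_1, out_2 = model(shared), model(shared)
merged = torch.cat([out_1, out_2], dim=0)  # now succeeds
```

With identical (and, in the real test, deterministic) inputs per replica, every GPU produces the same sequence length, so the gathered outputs concatenate cleanly.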

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 5, 2023

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@ydshieh ydshieh left a comment


Thanks a lot. Just a few nits for the comments.

Collaborator

@ArthurZucker ArthurZucker left a comment


Thanks! The tokenizers integration tests should rather use k as padding as we discussed!

@sanchit-gandhi sanchit-gandhi merged commit 2af87d0 into huggingface:main Sep 7, 2023
@sanchit-gandhi sanchit-gandhi deleted the vits-tests branch September 7, 2023 16:49
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
* fix tokenizer

* make bs even

* fix multi gpu test

* style

* model forward

* fix torch import

* revert tok pin
