
Conversation

@yqzhishen
Member

@yqzhishen yqzhishen commented Jun 12, 2023

Expressiveness controls how freely the variance model generates pitch curves. By default, the variance model predicts pitch at 100% expressiveness, which means it fully follows the style of the voice provider. Conversely, 0% expressiveness produces pitch that stays very close to the smoothed music score. Expressiveness can be set anywhere from 0% to 100%, either statically or dynamically at the frame level.

Expressiveness is implemented as a trick on retake_embed. Regions where retake == 1 (100% expressiveness) generate pitch as normal, while regions where retake == 0 (0% expressiveness) return the given base_pitch, which represents the music score. Linearly blending the two embeddings, weighted by an expressiveness value between 0 and 1, gives the effect of a continuous expressiveness curve.
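
The linear fusion described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name fuse_retake_embed, its argument names, and the toy 2-dimensional vectors are hypothetical and are not taken from the code in this pull request, which presumably operates on learned embedding tensors rather than lists.

```python
from typing import List, Sequence


def fuse_retake_embed(
    expressiveness: Sequence[float],
    embed_retake: Sequence[float],
    embed_keep: Sequence[float],
) -> List[List[float]]:
    """Blend the two retake embeddings frame by frame (hypothetical helper).

    expressiveness: one value in [0, 1] per frame.
    embed_retake:   stand-in for the embedding used where retake == 1.
    embed_keep:     stand-in for the embedding used where retake == 0.
    """
    if len(embed_retake) != len(embed_keep):
        raise ValueError("embedding vectors must have the same size")

    fused = []
    for e in expressiveness:
        if not 0.0 <= e <= 1.0:
            raise ValueError(f"expressiveness must be in [0, 1], got {e}")
        # Linear interpolation: e == 1 gives embed_retake, e == 0 gives embed_keep.
        fused.append(
            [e * r + (1.0 - e) * k for r, k in zip(embed_retake, embed_keep)]
        )
    return fused


# Toy 2-dimensional embeddings (made-up numbers, for illustration only).
embed_retake = [1.0, 0.0]
embed_keep = [0.0, 1.0]

# A dynamic curve that ramps from 0% to 100% expressiveness over five frames.
curve = [0.0, 0.25, 0.5, 0.75, 1.0]
fused = fuse_retake_embed(curve, embed_retake, embed_keep)
```

A static setting is the special case where every frame carries the same value, for example [0.6] * n_frames for a constant 60% expressiveness.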

@yqzhishen yqzhishen added this to the Version 2.0.0 milestone Jun 12, 2023
@yqzhishen yqzhishen changed the base branch from variance to refactor-v2 June 13, 2023 09:02
@yqzhishen yqzhishen marked this pull request as ready for review June 16, 2023 14:49
@yqzhishen yqzhishen removed this from the Version 2.0.0 milestone Jun 20, 2023
@yqzhishen yqzhishen marked this pull request as draft June 29, 2023 14:51
@yqzhishen yqzhishen changed the base branch from refactor-v2 to main July 29, 2023 14:55
# Conflicts:
#	deployment/exporters/variance_exporter.py
#	deployment/modules/toplevel.py
#	inference/ds_variance.py
#	modules/toplevel.py
@yqzhishen yqzhishen marked this pull request as ready for review August 9, 2023 18:19
@yqzhishen yqzhishen merged commit 38bc407 into main Aug 11, 2023
@yqzhishen yqzhishen deleted the expressiveness branch October 19, 2023 11:15


2 participants