
Conversation

@KakaruHayate

When training on multiple GPUs with DDP, a warning says that find_unused_parameters=True must be set, because DDP expects every learnable parameter to participate in the forward pass.

However, after setting find_unused_parameters=True, another warning says that find_unused_parameters=True is not actually needed.

This is caused by the cache mechanism in the RoPE implementation being used: when should_cache is true, self.freqs does not participate in the forward pass.
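A minimal sketch of the failure mode (hypothetical module and names, not the actual rotary_embedding_torch code): a learnable `freqs` parameter is read on a cache miss but skipped on a cache hit, so on cached steps DDP sees a parameter that never received a gradient and flags it as unused.

```python
import torch
from torch import nn

class CachedRoPE(nn.Module):
    """Hypothetical sketch of a RoPE module whose cache can hide a
    learnable parameter from the autograd graph."""

    def __init__(self, dim: int):
        super().__init__()
        # Learnable inverse frequencies: a registered Parameter, so DDP
        # expects it to receive a gradient on every training step.
        self.freqs = nn.Parameter(torch.arange(1, dim // 2 + 1).float())
        self.cached = None

    def forward(self, seq_len: int) -> torch.Tensor:
        should_cache = self.cached is not None and self.cached.shape[0] >= seq_len
        if should_cache:
            # Cache hit: self.freqs is never read, so it gets no gradient
            # and DDP reports it as an unused parameter.
            return self.cached[:seq_len]
        t = torch.arange(seq_len).float()
        emb = torch.outer(t, self.freqs)
        # The detached cache breaks the graph on all later calls.
        self.cached = emb.detach()
        return emb

rope = CachedRoPE(8)
first = rope(4)   # cache miss: freqs participates, grad can flow
second = rope(4)  # cache hit: freqs is absent from the graph
print(first.requires_grad, second.requires_grad)  # True False
```

On the first call the output depends on `self.freqs`, so gradients flow; on every cached call it does not, which is exactly the inconsistency behind the two contradictory DDP warnings.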

@yqzhishen yqzhishen merged commit 76620b3 into openvpi:main Apr 15, 2025
agentasteriski added a commit to agentasteriski/DiffSinger that referenced this pull request Apr 16, 2025
@KakaruHayate KakaruHayate deleted the patch-1 branch April 16, 2025 02:49
@KakaruHayate KakaruHayate restored the patch-1 branch April 16, 2025 03:37
yxlllc pushed a commit that referenced this pull request Aug 16, 2025 (#244)

* Fix issue about 'find_unused_parameters' when DDP training.

* annotation

* slim

* Fix issue about 'find_unused_parameters' when DDP training.

annotation

slim

* Update rotary_embedding_torch.py
@KakaruHayate KakaruHayate deleted the patch-1 branch October 15, 2025 07:55