[Feature] Detect anomalous parameters #1547
Conversation
Ping @ZwwWayne
After the above comments are resolved and the CI passes, the PR can be merged.
Can we check whether it slows down the training speed if we set `detect_anomalous_params=True`?
It probably slows down the training speed, but the flag should only be enabled for debugging.
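For example, turning it on only for a debugging run might look like this (a minimal sketch; it assumes the usual `optimizer_config` entry that builds `OptimizerHook` in mmcv-based configs):

```python
# Regular training: leave the check off to avoid the extra per-iteration graph traversal.
optimizer_config = dict(grad_clip=None)

# Debugging run: turn the check on so parameters outside the loss graph get logged.
optimizer_config = dict(grad_clip=None, detect_anomalous_params=True)
```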
Please merge the upstream master branch into this PR, which will resolve the error when building the documentation.
It is only used for debugging.
I will add a notice in the docstring to indicate this is only used for debugging.
Hi @jshilong, we will merge the PR after you add a notice in the docstring.
Done |
Motivation
Sometimes there are anomalous parameters in the training phase, mainly in two cases, and in both of them the anomalous parameters are not included in the computational graph that has `loss` as the root, so they receive no gradient from `loss.backward()`.
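As a concrete illustration (a toy example, not code from this PR): a parameter that never contributes to the loss ends up outside that graph and gets no gradient.

```python
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 4)
        self.unused = nn.Linear(4, 4)  # defined but never used in forward

    def forward(self, x):
        return self.used(x)  # self.unused never enters the graph

model = ToyModel()
loss = model(torch.randn(2, 4)).sum()
loss.backward()

print(model.used.weight.grad is None)    # False: gradient was computed
print(model.unused.weight.grad is None)  # True: anomalous parameter, no gradient
```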
Modification
I add a debug option named `detect_anomalous_params` to `OptimizerHook`, which can help you find anomalous parameters.
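Roughly, such a check can walk the autograd graph from `loss.grad_fn`, collect every parameter reachable from it, and log the ones that require gradients but were never reached. The sketch below is a simplified illustration under that assumption, not the exact code added in this PR:

```python
import logging
import torch

def detect_anomalous_parameters(loss: torch.Tensor, model: torch.nn.Module) -> None:
    """Log parameters that require grad but are missing from the graph rooted at ``loss``."""
    parameters_in_graph = set()
    visited = set()

    def traverse(grad_fn):
        if grad_fn is None or grad_fn in visited:
            return
        visited.add(grad_fn)
        # Leaf nodes (AccumulateGrad) expose the underlying parameter as ``.variable``.
        if hasattr(grad_fn, 'variable'):
            parameters_in_graph.add(grad_fn.variable)
        for parent, _ in grad_fn.next_functions:
            traverse(parent)

    traverse(loss.grad_fn)
    for name, param in model.named_parameters():
        if param.requires_grad and param not in parameters_in_graph:
            logging.error('%s with shape %s is not in the computational graph',
                          name, tuple(param.shape))
```

Because the whole graph is traversed on every call, keeping the option off by default (as discussed above) avoids any cost during normal training.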
BC-breaking (Optional)
None