[feat][OSS] Add clip_grad_value_ function. #308
Closed
Labels: enhancement (New feature or request)
Description
🚀 Feature
There are two popular gradient clipping methods: one limits the maximum absolute value of each gradient element, and the other scales the gradients based on the p-norm of a (sub-)set of model parameters.
clip_grad_norm (the second method) is useful when the norm of the gradients is large, but not when only a small subset of model parameters has abnormal gradient values: given the total number of parameters, the overall norm can still be reasonably small, so no clipping occurs.
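The difference can be illustrated with a minimal pure-Python sketch (the helper names `clip_by_value` and `clip_by_norm` are illustrative, not part of any library): one outlier gradient among many small ones barely moves the total norm, so norm-based clipping leaves it untouched, while value-based clipping clamps it directly.

```python
import math

def clip_by_value(grads, clip_value):
    # Element-wise clamp: force every gradient into [-clip_value, clip_value].
    return [max(-clip_value, min(clip_value, g)) for g in grads]

def clip_by_norm(grads, max_norm):
    # Rescale all gradients uniformly when their 2-norm exceeds max_norm.
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        return [g * scale for g in grads]
    return list(grads)

# 10,000 tiny gradients plus one outlier: total norm is sqrt(26) ~= 5.1,
# well under max_norm, so norm clipping changes nothing.
grads = [0.01] * 10000 + [5.0]
print(clip_by_norm(grads, max_norm=10.0) == grads)   # True: untouched
print(max(clip_by_value(grads, clip_value=1.0)))     # 1.0: outlier clamped
```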
Related PR (Pytorch Lightning): Lightning-AI/pytorch-lightning#5477
How about adding the following function to optim/oss.py?

    def clip_grad_value(
        self,
        clip_value: Union[float, int],
        filter_params_fn: Optional[Callable[[Any], Any]] = None,
    ) -> None: ...

I want to call this function from PL's sharded_native_amp_plugin.py.