Closed
Description
🚀 Feature request
I guess it would be best to ask this first: are there specific reasons why the activation functions in https://github.com/huggingface/transformers/blob/master/src/transformers/activations.py are not subclasses of `torch.nn.Module`?
If there are, then we can probably ignore everything else below :)
If there aren't, then it might be interesting to consider implementing them that way (I would be happy to work on a PR for it).
A few advantages (that I'm aware of) of having activations as subclasses of `torch.nn.Module`:
- it's easy to check which activations are used in the model by just running `print(my_bert)`. Currently one has to check the config file for it, which is also not that bad, but this makes it a bit more convenient. Just like printing a torchvision model with `print(resnet50)`, one can immediately see which activations are being used in the model.
- composing layers with, for example, `nn.Sequential` would be possible (I'm not sure if this is possible when activations are implemented as Python functions)
- attaching PyTorch hooks to activation modules would be possible (I think this is the most important advantage)
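To make the three points above concrete, here is a minimal sketch of what a module-based activation could look like. The class name `GELUActivation` and the helper `save_output` are hypothetical and only for illustration; this is not the library's current code, just one way the idea could be written using the exact (erf-based) GELU formula.

```python
import math

import torch
from torch import nn


class GELUActivation(nn.Module):
    """Hypothetical module wrapper around the exact (erf-based) GELU."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return 0.5 * x * (1.0 + torch.erf(x / math.sqrt(2.0)))


# Point 2: the activation composes with nn.Sequential like any other layer.
block = nn.Sequential(nn.Linear(4, 8), GELUActivation(), nn.Linear(8, 2))

# Point 1: the activation is listed by name when the model is printed.
print(block)

# Point 3: a forward hook can be attached to the activation module itself.
captured = []


def save_output(module, inputs, output):
    # Store a detached copy of the activation output for later inspection.
    captured.append(output.detach())


handle = block[1].register_forward_hook(save_output)
out = block(torch.randn(3, 4))
handle.remove()  # remove the hook once it is no longer needed
```

After the forward pass, `captured` holds the output of the activation layer, which is the kind of inspection that is awkward to do when the activation is a plain function called inside `forward`.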