Implementation of activations as subclasses of the torch.nn.Module #15364

@eldarkurtic

Description

🚀 Feature request

I guess it would be best to ask this first: is there a specific reason why the activation functions in https://github.com/huggingface/transformers/blob/master/src/transformers/activations.py are not subclasses of torch.nn.Module?

If there is, then we can probably ignore everything else below :) .
If there isn't, then it might be worth considering implementing them that way (I would be happy to work on a PR for it).
A few advantages (that I'm aware of) of implementing activations as subclasses of torch.nn.Module:

  1. it's easy to check which activations a model uses by simply running print(my_bert). Currently one has to check the config file, which is not that bad either, but this would make it a bit more convenient. Just like with torchvision models (print(resnet50)), one could immediately see which activations the model uses.
  2. composing layers with, for example, nn.Sequential would be possible (I'm not sure this is possible when activations are implemented as plain Python functions)
  3. attaching PyTorch hooks to activation modules would be possible (I think this is the most important advantage)
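To make the advantages concrete, here is a minimal sketch of an activation wrapped as an nn.Module (the class name GELUActivation is illustrative, not the library's actual implementation):

```python
import torch
import torch.nn as nn

class GELUActivation(nn.Module):
    """GELU wrapped as an nn.Module (illustrative sketch)."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return nn.functional.gelu(x)

# Advantage 1: the activation shows up in the module tree when printing the model
model = nn.Sequential(nn.Linear(4, 4), GELUActivation())
print(model)

# Advantage 2: the activation composes directly with nn.Sequential (done above)

# Advantage 3: forward hooks can be attached to the activation module
captured = []
model[1].register_forward_hook(lambda mod, inp, out: captured.append(out))
_ = model(torch.randn(2, 4))
# captured now holds the activation's output tensor for inspection
```

None of this is possible with a bare Python function, since hooks and module registration both require an nn.Module instance.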
