Control first downsample stride in ResNet (#26374)
ArthurZucker merged 4 commits into huggingface:main
Conversation
rafaelpadilla
left a comment
A new configuration parameter is being suggested in the ResNet config to meet the needs of a new model, while preserving backward compatibility by default. :)
However, the name of the suggested config parameter should be more intuitive so that other models can leverage it. Could you please make this change?
ArthurZucker
left a comment
Thanks for opening the follow up PR 🤗
```python
if config.layer_type == "bottleneck":
    self.layers = nn.Sequential(
        # downsampling is done in the first layer with stride of 2
        layer(
            in_channels,
            out_channels,
            stride=stride,
            activation=config.hidden_act,
            reduce_first=config.reduce_first,
        ),
        *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
    )
else:
    self.layers = nn.Sequential(
        # downsampling is done in the first layer with stride of 2
        layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
        *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
    )
```
this can be simplified 😉
Suggested change:
```diff
- if config.layer_type == "bottleneck":
-     self.layers = nn.Sequential(
-         # downsampling is done in the first layer with stride of 2
-         layer(
-             in_channels,
-             out_channels,
-             stride=stride,
-             activation=config.hidden_act,
-             reduce_first=config.reduce_first,
-         ),
-         *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
-     )
- else:
-     self.layers = nn.Sequential(
-         # downsampling is done in the first layer with stride of 2
-         layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
-         *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
-     )
+ reduce_first = config.reduce_first if config.layer_type == "bottleneck" else False
+ self.layers = nn.Sequential(
+     # downsampling is done in the first layer with stride of 2
+     layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
+     *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
+ )
```
or something like this
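Read on its own, the reviewer's idea is to hoist only the constructor kwargs that differ between the two branches, so the `nn.Sequential` call is written once. A minimal standalone sketch of that pattern (the `make_layer` stand-in and its dict return value are illustrative, not the actual transformers code):

```python
def make_layer(in_ch, out_ch, stride=1, activation="relu", **extra):
    # Stand-in for ResNetBasicLayer / ResNetBottleNeckLayer construction;
    # returns a plain dict so the sketch stays self-contained.
    return {"in": in_ch, "out": out_ch, "stride": stride,
            "activation": activation, **extra}

def build_stage(layer_type, in_ch, out_ch, stride, depth, reduce_first=False):
    # Only the first layer's kwargs differ between the two branches,
    # so compute them up front instead of duplicating the whole call.
    first_kwargs = {"stride": stride, "activation": "relu"}
    if layer_type == "bottleneck":
        first_kwargs["reduce_first"] = reduce_first
    layers = [make_layer(in_ch, out_ch, **first_kwargs)]
    layers += [make_layer(out_ch, out_ch) for _ in range(depth - 1)]
    return layers

stage = build_stage("bottleneck", 64, 256, stride=2, depth=3, reduce_first=True)
```

The point of the refactor is that the shared construction path is written once; the branch only decides which extra kwargs the first layer receives.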
Hi @ArthurZucker @rafaelpadilla. Thanks for your advice. A ResNet backbone that downsamples in the bottleneck layer can be found in resnet.
rafaelpadilla
left a comment
Looks good to me.
But including docstrings in the constructors of ResNetBottleNeckLayer and ResNetBasicLayer seems important to me, since the new parameter downsample_in_bottleneck is being introduced.
```python
out_channels: int,
stride: int = 1,
activation: str = "relu",
downsample_in_bottleneck: bool = False,
```
I see the purpose of downsample_in_bottleneck here: it spares one from having to distinguish between ResNetBottleNeckLayer and ResNetBasicLayer. But it is not being used in this class.
So I think it is valuable to include a docstring here and clarify why downsample_in_bottleneck isn't used.
Actually, let's not pollute this class with this; the if/else logic should be used to pass the correct args.
ArthurZucker
left a comment
LGTM apart from the arg that is not used. Let's have explicit and useful args; doing an if/else to choose the layer is okay and more understandable.
```python
out_channels: int,
stride: int = 1,
activation: str = "relu",
downsample_in_bottleneck: bool = False,
```
Actually, let's not pollute this class with this; the if/else logic should be used to pass the correct args.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Sure!
ArthurZucker
left a comment
Sorry, this should be the last iteration!
```python
if config.layer_type == "bottleneck":
    self.layers = nn.Sequential(
        # downsampling is done in the first layer with stride of 2
        layer(
            in_channels,
            out_channels,
            stride=stride,
            activation=config.hidden_act,
            downsample_in_bottleneck=config.downsample_in_bottleneck,
        ),
        *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
    )
else:
    self.layers = nn.Sequential(
        # downsampling is done in the first layer with stride of 2
        layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
        *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
    )
```
No, as I said before, the only layer that is different here is the first call. Something like
Suggested change:
```diff
- if config.layer_type == "bottleneck":
-     self.layers = nn.Sequential(
-         # downsampling is done in the first layer with stride of 2
-         layer(
-             in_channels,
-             out_channels,
-             stride=stride,
-             activation=config.hidden_act,
-             downsample_in_bottleneck=config.downsample_in_bottleneck,
-         ),
-         *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
-     )
- else:
-     self.layers = nn.Sequential(
-         # downsampling is done in the first layer with stride of 2
-         layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
-         *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
-     )
+ if config.layer_type == "bottleneck":
+     first_layer = layer(in_channels, out_channels, stride=stride, activation=config.hidden_act, downsample_in_bottleneck=config.downsample_in_bottleneck)
+ else:
+     first_layer = layer(in_channels, out_channels, stride=stride, activation=config.hidden_act)
+ self.layers = nn.Sequential(first_layer, *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)])
```
something like this should be enough.
ArthurZucker
left a comment
thanks for this round! merging 😉
Control first downsample stride in ResNet (huggingface#26374)
Hi @ArthurZucker
Related to 25856. I added a new parameter to the config to control the stride of the first bottleneck layer in each stage. Would you please help me review it? Thanks!
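For readers skimming the thread, here is a hedged sketch of what the new flag is meant to control, assuming the standard bottleneck structure (1x1 reduce, 3x3, 1x1 expand). The helper below is hypothetical and only `downsample_in_bottleneck` comes from this PR:

```python
def bottleneck_strides(stride: int, downsample_in_bottleneck: bool) -> tuple:
    """Hypothetical helper: which of the three bottleneck convs carries the stride.

    downsample_in_bottleneck=True  -> the first 1x1 conv downsamples
    downsample_in_bottleneck=False -> the middle 3x3 conv downsamples
    (the backward-compatible default, matching the previous behavior)
    """
    if downsample_in_bottleneck:
        return (stride, 1, 1)
    return (1, stride, 1)
```

With a stage stride of 2, only the placement of the downsampling conv changes between the two settings; the output shape of the block stays the same.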