Control first downsample stride in ResNet (#26374)
ArthurZucker merged 4 commits into huggingface:main
Conversation
rafaelpadilla
left a comment
A new configuration parameter is being suggested in the ResNet config to meet the needs of a new model, while preserving backward compatibility by default. :)
However, the name of the suggested config parameter should be more intuitive so that other models can leverage it. Could you please make this change?
ArthurZucker
left a comment
Thanks for opening the follow up PR 🤗
```python
if config.layer_type == "bottleneck":
    self.layers = nn.Sequential(
        # downsampling is done in the first layer with stride of 2
        layer(
            in_channels,
            out_channels,
            stride=stride,
            activation=config.hidden_act,
            reduce_first=config.reduce_first,
        ),
        *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
    )
else:
    self.layers = nn.Sequential(
        # downsampling is done in the first layer with stride of 2
        layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
        *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
    )
```
this can be simplified 😉
Suggested change:
```diff
- if config.layer_type == "bottleneck":
-     self.layers = nn.Sequential(
-         # downsampling is done in the first layer with stride of 2
-         layer(
-             in_channels,
-             out_channels,
-             stride=stride,
-             activation=config.hidden_act,
-             reduce_first=config.reduce_first,
-         ),
-         *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
-     )
- else:
-     self.layers = nn.Sequential(
-         # downsampling is done in the first layer with stride of 2
-         layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
-         *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
-     )
+ reduce_first = config.reduce_first if config.layer_type == "bottleneck" else False
+ self.layers = nn.Sequential(
+     # downsampling is done in the first layer with stride of 2
+     layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
+     *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
+ )
```
or something like this
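Read on its own, the reviewer's idea is to hoist only the constructor kwargs that differ between the two branches, so the `nn.Sequential` call is written once. A minimal standalone sketch of that pattern (the `make_layer` stand-in and its dict return value are illustrative, not the actual transformers code):

```python
def make_layer(in_ch, out_ch, stride=1, activation="relu", **extra):
    # Stand-in for ResNetBasicLayer / ResNetBottleNeckLayer construction;
    # returns a plain dict so the sketch stays self-contained.
    return {"in": in_ch, "out": out_ch, "stride": stride,
            "activation": activation, **extra}

def build_stage(layer_type, in_ch, out_ch, stride, depth, reduce_first=False):
    # Only the first layer's kwargs differ between the two branches,
    # so compute them up front instead of duplicating the whole call.
    first_kwargs = {"stride": stride, "activation": "relu"}
    if layer_type == "bottleneck":
        first_kwargs["reduce_first"] = reduce_first
    layers = [make_layer(in_ch, out_ch, **first_kwargs)]
    layers += [make_layer(out_ch, out_ch) for _ in range(depth - 1)]
    return layers

stage = build_stage("bottleneck", 64, 256, stride=2, depth=3, reduce_first=True)
```

The point of the refactor is that the shared construction path is written once; the branch only decides which extra kwargs the first layer receives.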
Hi @ArthurZucker @rafaelpadilla. Thanks for your advice. A ResNet backbone that downsamples in the bottleneck layer can be found in resnet.
rafaelpadilla
left a comment
Looks good to me.
But including docstrings in the constructors of ResNetBottleNeckLayer and ResNetBasicLayer seems important to me, since the new parameter downsample_in_bottleneck is being introduced.
```python
out_channels: int,
stride: int = 1,
activation: str = "relu",
downsample_in_bottleneck: bool = False,
```
I see the purpose of downsample_in_bottleneck here: it spares one from having to distinguish between ResNetBottleNeckLayer and ResNetBasicLayer. But it is not being used in this class.
So I think it is valuable to include a docstring here and clarify why downsample_in_bottleneck isn't used.
Actually, let's not pollute this class with this; the if/else logic should be used to pass the correct args.
ArthurZucker
left a comment
LGTM apart from the arg that is not used. Let's have explicit and useful args; doing an if/else to choose the layer is okay and more understandable.
```python
out_channels: int,
stride: int = 1,
activation: str = "relu",
downsample_in_bottleneck: bool = False,
```
Actually, let's not pollute this class with this; the if/else logic should be used to pass the correct args.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Sure!
ArthurZucker
left a comment
Sorry, this should be the last iteration!
```python
if config.layer_type == "bottleneck":
    self.layers = nn.Sequential(
        # downsampling is done in the first layer with stride of 2
        layer(
            in_channels,
            out_channels,
            stride=stride,
            activation=config.hidden_act,
            downsample_in_bottleneck=config.downsample_in_bottleneck,
        ),
        *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
    )
else:
    self.layers = nn.Sequential(
        # downsampling is done in the first layer with stride of 2
        layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
        *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
    )
```
No, as I said before, the only layer that is different here is the first call. Something like
Suggested change:
```diff
- if config.layer_type == "bottleneck":
-     self.layers = nn.Sequential(
-         # downsampling is done in the first layer with stride of 2
-         layer(
-             in_channels,
-             out_channels,
-             stride=stride,
-             activation=config.hidden_act,
-             downsample_in_bottleneck=config.downsample_in_bottleneck,
-         ),
-         *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
-     )
- else:
-     self.layers = nn.Sequential(
-         # downsampling is done in the first layer with stride of 2
-         layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
-         *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
-     )
+ if config.layer_type == "bottleneck":
+     first_layer = layer(in_channels, out_channels, stride=stride, activation=config.hidden_act, downsample_in_bottleneck=config.downsample_in_bottleneck)
+ else:
+     first_layer = layer(in_channels, out_channels, stride=stride, activation=config.hidden_act)
+ self.layers = nn.Sequential(first_layer, *[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)])
```
something like this should be enough.
ArthurZucker
left a comment
thanks for this round! merging 😉
Control first downsample stride in ResNet (huggingface#26374)
Hi @ArthurZucker
Related to 25856. I added a new parameter to the config to control the stride of the first bottleneck layer in each stage. Would you please help me review it? Thanks!
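For readers skimming the thread, here is a hedged sketch of what the new flag is meant to control, assuming the standard bottleneck structure (1x1 reduce, 3x3, 1x1 expand). The helper below is hypothetical and only `downsample_in_bottleneck` comes from this PR:

```python
def bottleneck_strides(stride: int, downsample_in_bottleneck: bool) -> tuple:
    """Hypothetical helper: which of the three bottleneck convs carries the stride.

    downsample_in_bottleneck=True  -> the first 1x1 conv downsamples
    downsample_in_bottleneck=False -> the middle 3x3 conv downsamples
    (the backward-compatible default, matching the previous behavior)
    """
    if downsample_in_bottleneck:
        return (stride, 1, 1)
    return (1, stride, 1)
```

With a stage stride of 2, only the placement of the downsampling conv changes between the two settings; the output shape of the block stays the same.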