Refactor S3D model by sophiazhi · Pull Request #135 · facebookresearch/multimodal

sophiazhi · 2022-07-06T03:32:23Z

Summary:
Followup to #104: Refactor S3D model to reduce code repetition

Test plan:
python -m pytest examples/mugen/test/retrieval/test_s3d.py -rP
Coverage:

examples/mugen/test/retrieval/test_s3d.py::TestS3D::test_forward PASSED                     [ 25%]
examples/mugen/test/retrieval/test_s3d.py::TestS3D::test_basicconv3d PASSED                 [ 50%]
examples/mugen/test/retrieval/test_s3d.py::TestS3D::test_sepconv3d PASSED                   [ 75%]
examples/mugen/test/retrieval/test_s3d.py::TestS3D::test_mixedconvsblock PASSED             [100%]

Name                                     Stmts   Miss  Cover
examples/mugen/retrieval/s3d.py             66      0   100%
examples/mugen/retrieval/video_clip.py      28     28     0%
TOTAL                                       94     28    70%


======================================== 4 passed in 5.03s ========================================

codecov-commenter · 2022-07-06T03:34:44Z

Codecov Report

Merging #135 (2a27e71) into main (855c9ed) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #135   +/-   ##
=======================================
  Coverage   90.67%   90.67%           
=======================================
  Files          46       46           
  Lines        2403     2403           
=======================================
  Hits         2179     2179           
  Misses        224      224

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 855c9ed...2a27e71. Read the comment docs.

ebsmothers

Looks great! Thanks for refactoring these classes, the way you're doing it now is really nice. Left a few nits but otherwise looks good.

examples/mugen/retrieval/s3d.py

ebsmothers · 2022-07-06T17:07:53Z

examples/mugen/retrieval/s3d.py

-            BasicConv3d(480, 96, kernel_size=1, stride=1),
-            SepConv3d(96, 208, kernel_size=3, stride=1, padding=1),
+class Branch1or2(nn.Sequential):
+    def __init__(self, in_planes, mid_planes, out_planes):


nit: add type info (here and in other classes). Mypy won't complain about examples/ so it's not a hard requirement, but still will make the code clearer.

langong347 · 2022-07-08T00:05:06Z

@YosuaMichael Would love to get your eyes on this. This way of implementing inception blocks is different from torchvision. This version of using S3D as the backbone may as well be a good addition to TorchVision' inception class in the future.

YosuaMichael

Hi @sophiazhi , thanks the PR! I have a few comments on making more general class for S3D and some suggestions for the implementation and naming.
Also similar with @ebsmothers suggestion, it would be great to have type info for public class constructor and forward.

examples/mugen/retrieval/s3d.py

YosuaMichael · 2022-07-08T11:06:30Z

examples/mugen/retrieval/s3d.py

@@ -30,17 +30,17 @@ def __init__(self, num_class):
            BasicConv3d(64, 64, kernel_size=1, stride=1),


I think we can make a more generic class that able to handle both S3D and I3D (see https://arxiv.org/pdf/1705.07750.pdf). The idea (just to illustrate, the naming might not be good) is to enable user to choose the class for the conv_block (S3D: SepConv3D, I3D: Conv3D), max_pool_block, and inception_block (S3D: SepInceptionModule3D, I3D: InceptionModule3D) (examples: here or here).

It is also good to let the parameters also be configurable on the generic class, this will be useful if there will be a new variant with different parameters in the future. This is not a must, however since there are a lot of parameters, it might be a good idea to create a config class to store the parameters (example).

Once we have the generic module, we can have a builder to build the s3d model (example)

examples/mugen/retrieval/s3d.py

YosuaMichael · 2022-07-08T16:13:55Z

I also want to suggest to reuse Conv3dNormActivation from torchvision. I think this can be used to replace BasicConv3d by using

Conv3dNormActivation(64, 64, kernel_size=1, stride=1, norm_layer=partial(nn.BatchNorm3d, eps=1e-3, momentum=1e-3))

Also SepConv3d can also be simplified:

class SepConv3d(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size, stride, padding=0):
        super(SepConv3d, self).__init__(
            Conv3dNormActivation(
                in_planes,
                out_planes,
                kernel_size=(1, kernel_size, kernel_size),
                stride=(1, stride, stride),
                padding=(0, padding, padding),
                bias=False,
            ),
            Conv3dNormActivation(
                out_planes,
                out_planes,
                kernel_size=(kernel_size, 1, 1),
                stride=(stride, 1, 1),
                padding=(padding, 0, 0),
                bias=False,
            ),
        )

Another thing is that the padding is usually constructor with common formula of (kernel_size - 1) // 2 * dilation and I observe that it is the case here too. In this case, you can also not passing the padding manually :)

Take note that I have just updated Conv3dNormActivation so it can accept a tuple kernel_size. So in case you want to use it, please either use install torchvision from source or wait for tomorrow nightly version.

datumbox

@sophiazhi In continuation to our discussion at pytorch/vision#6402, I think this implementation is quite good and worth upstreaming to TorchVision.

I've added a few comments to indicate parts that would probably need to change to make it easier to upstream. My comments are minor and don't really affect your implementation; rather just suggest specific idioms that TorchVision uses.

There is a bit of boilerplate code that would also be required for creating model builder methods and weights (we will need to port the paper weights) but again nothing too much. We can guide you through the process during the PR. My recommendation is to open one on the repo and I can guide you from there.

datumbox · 2022-08-12T07:43:51Z

examples/mugen/retrieval/s3d.py

+class Branch0(nn.Sequential):
+    def __init__(self, in_planes, out_planes):
+        super(Branch0, self).__init__(
+            BasicConv3d(in_planes, out_planes, kernel_size=1, stride=1)


TorchVision has a standard block for BasicConv3d. It's called Conv3dNormActivation.

SepConv3d can also be rewritten based on it. Not sure if we should call them separable conv because there are normalization and activation layers in between them but we can review naming later.

datumbox · 2022-08-12T07:45:32Z

examples/mugen/retrieval/s3d.py

+        self.branch0 = Branch0(in_planes, b0_out)
+        self.branch1 = Branch1or2(in_planes, b1_mid, b1_out)
+        self.branch2 = Branch1or2(in_planes, b2_mid, b2_out)
+        self.branch3 = Branch3(in_planes, b3_out)


Can we avoid the introduction of the Branch* classes? From what I understand, we can just use Sequential directly and simplify the code structure.

datumbox · 2022-08-12T07:49:25Z

examples/mugen/retrieval/s3d.py

-            Mixed5b(),
-            Mixed5c(),
+            MixedConvsBlock(832, 256, 160, 320, 32, 128, 128),
+            MixedConvsBlock(832, 384, 192, 384, 48, 128, 128),


Sorry putting a comment on a random position because Github doesn't let me comment on the constructor of the S3D class. We need to align the arguments you pass to make them similar to what TorchVision is currently using (see Video Resnet class, nothing to different).

sophiazhi · 2022-08-18T16:19:44Z

Closing due to pytorch/vision#6412

sophiazhi added 2 commits July 6, 2022 03:24

Refactor S3D model

77ec5d8

Merge remote-tracking branch 'upstream/main' into szhi-s3d_refactor

2a27e71

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 6, 2022

sophiazhi requested a review from ebsmothers July 6, 2022 03:35

sophiazhi marked this pull request as ready for review July 6, 2022 03:36

ebsmothers approved these changes Jul 6, 2022

View reviewed changes

sophiazhi requested a review from YosuaMichael July 6, 2022 18:17

YosuaMichael reviewed Jul 8, 2022

View reviewed changes

sophiazhi mentioned this pull request Aug 11, 2022

S3D feature request pytorch/vision#6402

Closed

datumbox reviewed Aug 12, 2022

View reviewed changes

sophiazhi closed this Aug 18, 2022

		@@ -30,17 +30,17 @@ def __init__(self, num_class):
		BasicConv3d(64, 64, kernel_size=1, stride=1),

Conversation

sophiazhi commented Jul 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented Jul 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ebsmothers left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ebsmothers Jul 6, 2022

Choose a reason for hiding this comment

Uh oh!

langong347 commented Jul 8, 2022

Uh oh!

YosuaMichael left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

YosuaMichael Jul 8, 2022

Choose a reason for hiding this comment

Uh oh!

Uh oh!

YosuaMichael commented Jul 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

datumbox left a comment

Choose a reason for hiding this comment

Uh oh!

datumbox Aug 12, 2022

Choose a reason for hiding this comment

Uh oh!

datumbox Aug 12, 2022

Choose a reason for hiding this comment

Uh oh!

datumbox Aug 12, 2022

Choose a reason for hiding this comment

Uh oh!

sophiazhi commented Aug 18, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

sophiazhi commented Jul 6, 2022 •

edited

Loading

codecov-commenter commented Jul 6, 2022 •

edited

Loading

YosuaMichael left a comment •

edited

Loading

YosuaMichael commented Jul 8, 2022 •

edited

Loading