Conversation
Codecov Report
@@ Coverage Diff @@
## main #135 +/- ##
=======================================
Coverage 90.67% 90.67%
=======================================
Files 46 46
Lines 2403 2403
=======================================
Hits 2179 2179
Misses 224 224 Continue to review full report at Codecov.
|
ebsmothers
left a comment
There was a problem hiding this comment.
Looks great! Thanks for refactoring these classes, the way you're doing it now is really nice. Left a few nits but otherwise looks good.
| BasicConv3d(480, 96, kernel_size=1, stride=1), | ||
| SepConv3d(96, 208, kernel_size=3, stride=1, padding=1), | ||
| class Branch1or2(nn.Sequential): | ||
| def __init__(self, in_planes, mid_planes, out_planes): |
There was a problem hiding this comment.
nit: add type info (here and in other classes). Mypy won't complain about examples/ so it's not a hard requirement, but still will make the code clearer.
|
@YosuaMichael Would love to get your eyes on this. This way of implementing inception blocks is different from torchvision. This version of using S3D as the backbone may as well be a good addition to TorchVision' inception class in the future. |
There was a problem hiding this comment.
Hi @sophiazhi , thanks the PR! I have a few comments on making more general class for S3D and some suggestions for the implementation and naming.
Also similar with @ebsmothers suggestion, it would be great to have type info for public class constructor and forward.
| @@ -30,17 +30,17 @@ def __init__(self, num_class): | |||
| BasicConv3d(64, 64, kernel_size=1, stride=1), | |||
There was a problem hiding this comment.
I think we can make a more generic class that able to handle both S3D and I3D (see https://arxiv.org/pdf/1705.07750.pdf). The idea (just to illustrate, the naming might not be good) is to enable user to choose the class for the conv_block (S3D: SepConv3D, I3D: Conv3D), max_pool_block, and inception_block (S3D: SepInceptionModule3D, I3D: InceptionModule3D) (examples: here or here).
It is also good to let the parameters also be configurable on the generic class, this will be useful if there will be a new variant with different parameters in the future. This is not a must, however since there are a lot of parameters, it might be a good idea to create a config class to store the parameters (example).
Once we have the generic module, we can have a builder to build the s3d model (example)
|
I also want to suggest to reuse Conv3dNormActivation from torchvision. I think this can be used to replace Also Another thing is that the Take note that I have just updated |
datumbox
left a comment
There was a problem hiding this comment.
@sophiazhi In continuation to our discussion at pytorch/vision#6402, I think this implementation is quite good and worth upstreaming to TorchVision.
I've added a few comments to indicate parts that would probably need to change to make it easier to upstream. My comments are minor and don't really affect your implementation; rather just suggest specific idioms that TorchVision uses.
There is a bit of boilerplate code that would also be required for creating model builder methods and weights (we will need to port the paper weights) but again nothing too much. We can guide you through the process during the PR. My recommendation is to open one on the repo and I can guide you from there.
| class Branch0(nn.Sequential): | ||
| def __init__(self, in_planes, out_planes): | ||
| super(Branch0, self).__init__( | ||
| BasicConv3d(in_planes, out_planes, kernel_size=1, stride=1) |
There was a problem hiding this comment.
TorchVision has a standard block for BasicConv3d. It's called Conv3dNormActivation.
SepConv3d can also be rewritten based on it. Not sure if we should call them separable conv because there are normalization and activation layers in between them but we can review naming later.
| self.branch0 = Branch0(in_planes, b0_out) | ||
| self.branch1 = Branch1or2(in_planes, b1_mid, b1_out) | ||
| self.branch2 = Branch1or2(in_planes, b2_mid, b2_out) | ||
| self.branch3 = Branch3(in_planes, b3_out) |
There was a problem hiding this comment.
Can we avoid the introduction of the Branch* classes? From what I understand, we can just use Sequential directly and simplify the code structure.
| Mixed5b(), | ||
| Mixed5c(), | ||
| MixedConvsBlock(832, 256, 160, 320, 32, 128, 128), | ||
| MixedConvsBlock(832, 384, 192, 384, 48, 128, 128), |
There was a problem hiding this comment.
Sorry putting a comment on a random position because Github doesn't let me comment on the constructor of the S3D class. We need to align the arguments you pass to make them similar to what TorchVision is currently using (see Video Resnet class, nothing to different).
|
Closing due to pytorch/vision#6412 |
Summary:
Followup to #104: Refactor S3D model to reduce code repetition
Test plan:
python -m pytest examples/mugen/test/retrieval/test_s3d.py -rPCoverage: