[Feature] Support Segmenter #952
rstrudel wants to merge 12 commits into open-mmlab:master from rstrudel:master
Conversation
Hi @rstrudel

Co-authored-by: Junjun2016 <hejunjun@sjtu.edu.cn>

Hi @Junjun2016 , einops is quite useful, especially for weight conversion, for example:

Our principle is to use as few third-party dependencies as possible.
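For reference, the kind of rearrangement einops is convenient for can usually be reproduced with built-in tensor ops, which keeps the dependency list short. A minimal sketch (the weight shapes here are illustrative assumptions, not the actual converter code):

```python
import torch

# Hypothetical weight layout: a JAX-style attention projection stored as
# (embed_dim, num_heads, head_dim), flattened to the
# (num_heads * head_dim, embed_dim) layout a torch Linear expects.
embed_dim, num_heads, head_dim = 768, 12, 64
w_jax = torch.randn(embed_dim, num_heads, head_dim)

# With einops this would be: rearrange(w_jax, 'e h d -> (h d) e')
# The same conversion with plain tensor ops:
w_torch = w_jax.permute(1, 2, 0).reshape(num_heads * head_dim, embed_dim)
```

Either form produces the same tensor, so the einops call can be dropped without changing the converted checkpoints.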
loss_decode=dict(
    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
),
test_cfg=dict(mode='slide', crop_size=(512, 512), stride=(512, 512)),
It seems that the sliding window has no overlap.
Is this the same as the setting in your paper?
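For context, with slide-mode inference the overlap between adjacent crops is crop_size minus stride, so a stride equal to the crop size means the windows tile the image with zero overlap. A quick sanity check (the alternative stride value below is purely illustrative):

```python
# Overlap between adjacent sliding-window crops is crop_size - stride.
crop_size, stride = 512, 512
overlap = crop_size - stride  # 0: windows tile the image without overlap

# A smaller stride (illustrative value) would give overlapping crops:
overlap_with_smaller_stride = crop_size - 341
```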
@@ -0,0 +1,34 @@
# model settings
We should not put too many configs in _base_; just put one base config there.
Then inherit the base config in configs/segmenter to generate different configs with different settings (model, dataset, or schedule) by overriding the keys with different values.
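As a sketch of the suggested layout (the file names below are illustrative, following the existing _base_ convention; only the keys that differ are overridden):

```python
# configs/segmenter/segmenter_vit-b_linear_8x1_512x512_160k_ade20k.py
# (file names here are illustrative)
_base_ = [
    '../_base_/models/segmenter_vit-b.py',
    '../_base_/datasets/ade20k.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_160k.py',
]
# Override only what differs from the base model config:
model = dict(decode_head=dict(num_classes=150))
```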
I moved the files to configs/segmenter in https://github.com/rstrudel/mmsegmentation/commit/58f7bece4e7a63837fa453147a3c91724ed2f2a5
@@ -0,0 +1,20 @@
_base_ = [
Rename configs/segmenter/segmenter_vit-b_linear_512x512_160k_bs8_ade20k.py to configs/segmenter/segmenter_vit-b_linear_8x1_512x512_160k_ade20k.py
Refer to https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv1.
Update other configs according to this config's comments.
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_160k.py',
]
find_unused_parameters = True
Remove find_unused_parameters.
type='SegmenterLinearHead',
in_channels=768,
channels=768,
num_classes=20,
-    num_classes=20,
+    num_classes=19,

The number of classes is 19 on Cityscapes.
auxiliary_head=[],
test_cfg=dict(mode='slide', crop_size=(512, 512), stride=(512, 512)),

Can be removed since these are the same as the base config.
optimizer = dict(lr=0.001, weight_decay=0.0)

# num_gpus: 8 -> batch_size: 8
data = dict(samples_per_gpu=1, )
-data = dict(samples_per_gpu=1, )
+data = dict(samples_per_gpu=1)
# num_gpus: 8 -> batch_size: 8
data = dict(samples_per_gpu=1, )
# TODO: handle img_norm_cfg
# img_norm_cfg = dict(mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], to_rgb=True)
Did you use this img_norm_cfg in your paper?
Yes, the normalization for ViT is [0.5, 0.5, 0.5] for both mean and std (assuming the input tensor is in [0, 1]).
In my repository, I checked that the loading and normalization used for the ViT checkpoints were valid by verifying that the resulting performance on the ImageNet validation set was correct.
So this means we should use this img_norm_cfg in Segmenter's base config?
> Yes, the normalization of ViT is [0.5, 0.5, 0.5] for mean and std (assuming the input tensor is in [0, 1]). In my repository, I checked that the loading and normalization used for ViT checkpoints was valid by checking the resulting performances on the ImageNet validation set, which were correct.
Great, but we also need to align the inference performance for semantic segmentation first; the next step is to align the training performance.
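For what it's worth, the two conventions agree: mean/std of 0.5 applied to inputs in [0, 1] is the same transform as mean/std of 127.5 applied to raw [0, 255] pixels, which is what the commented-out img_norm_cfg expresses. A quick check:

```python
import numpy as np

# mean/std = 0.5 applied after scaling pixels to [0, 1] ...
img = np.random.randint(0, 256, size=(4, 4, 3)).astype(np.float64)
vit_style = (img / 255.0 - 0.5) / 0.5

# ... is the same transform as mean/std = 127.5 on raw [0, 255] pixels,
# i.e. img_norm_cfg = dict(mean=[127.5]*3, std=[127.5]*3).
mmseg_style = (img - 127.5) / 127.5
```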
from .decode_head import BaseDecodeHead


def init_weights(m, std=0.02):
It is suggested to use init_cfg to control weight initialization (inheriting from BaseModule); refer to https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/backbones/resnet.py#L370 and https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/base_module.py#L56.
If the init weight strategy cannot be fully covered by init_cfg, we should override init_weights from BaseModule (https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/base_module.py#L56); refer to https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/backbones/swin.py#L661, https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/backbones/vit.py#L262, and https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/backbones/mit.py#L365.
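A sketch of what the init_cfg route could look like as a config fragment (the exact initializer type and layer keys are assumptions; mmcv registers initializers such as 'TruncNormal'):

```python
# Instead of a module-level init_weights(m, std=0.02) helper, declare the
# strategy on the head and let BaseModule.init_weights() apply it.
model = dict(
    decode_head=dict(
        type='SegmenterLinearHead',
        init_cfg=dict(type='TruncNormal', layer='Linear', std=0.02),
    ))
```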
@HEADS.register_module()
class SegmenterLinearHead(BaseDecodeHead):
Can inherit from FCNHead and override forward.
def __init__(self, in_channels, init_std=0.02, **kwargs):
    super(SegmenterLinearHead, self).__init__(
        in_channels=in_channels, **kwargs)
    self.head = nn.Linear(in_channels, self.num_classes)
Can use 1x1 conv instead.
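The two are mathematically equivalent; a 1x1 conv just keeps the head operating on NCHW tensors like the other mmseg heads, instead of permuting to NHWC for nn.Linear. A standalone check of the equivalence:

```python
import torch
import torch.nn as nn

in_channels, num_classes = 768, 19
linear = nn.Linear(in_channels, num_classes)
conv = nn.Conv2d(in_channels, num_classes, kernel_size=1)

# Share weights so both layers compute the same function.
with torch.no_grad():
    conv.weight.copy_(linear.weight[:, :, None, None])
    conv.bias.copy_(linear.bias)

x = torch.randn(2, in_channels, 8, 8)  # NCHW feature map
out_conv = conv(x)
# nn.Linear needs NHWC, then back to NCHW:
out_linear = linear(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
```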
Hi @rstrudel

@HEADS.register_module()
class SegmenterMaskTransformerHead(BaseDecodeHead):
Should also refactor this segmentation head according to the above comments.
class SegmenterLinearHead(BaseDecodeHead):

    def __init__(self, in_channels, init_std=0.02, **kwargs):
        super(SegmenterLinearHead, self).__init__(
Missing docstring and unittests.
class SegmenterMaskTransformerHead(BaseDecodeHead):

    def __init__(
        self,
Missing docstring and unittests.
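A hedged sketch of the kind of docstring mmseg heads carry (Google style). BaseDecodeHead is replaced by a trivial placeholder so the snippet stands alone, and the argument list is an assumption based on the signature shown above:

```python
class BaseDecodeHead:
    """Placeholder standing in for mmseg's BaseDecodeHead."""


class SegmenterMaskTransformerHead(BaseDecodeHead):
    """Mask transformer decode head from Segmenter.

    Args:
        in_channels (int): Number of channels of the input feature map.
        init_std (float): Std of the truncated-normal weight init.
            Default: 0.02.
    """

    def __init__(self, in_channels, init_std=0.02, **kwargs):
        super().__init__()
        self.in_channels = in_channels
        self.init_std = init_std
```

A matching unit test under tests/test_models would then instantiate the head and assert on the output shape.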
Thanks for all the comments @Junjun2016, I will work on them as soon as I can. Let me create a PR from a new branch.

I am closing this PR following #952 (comment); the new PR is #955 and I will make progress there.
* fix demo
* update
* fix
* fix bug
* fix bug
* update doc
Motivation
Add Segmenter method to mmsegmentation.
Modification
I added configuration files to train Segmenter on ADE20K and Cityscapes. I also added a script to convert the original JAX ViT checkpoints into checkpoints compatible with the ViT class of mmsegmentation.
To be done
What's not there yet:
img_norm_cfg=[127.5, 127.5, 127.5] as default for ViT checkpoints