
[Feature] Support Segmenter#952

Closed
rstrudel wants to merge 12 commits into open-mmlab:master from rstrudel:master

Conversation

@rstrudel
Contributor

@rstrudel rstrudel commented Oct 11, 2021

Motivation

Add Segmenter method to mmsegmentation.

Modification

I added configuration files to train Segmenter on ADE20K and Cityscapes. I also added a script to convert the original JAX ViT checkpoints into checkpoints compatible with mmsegmentation's ViT class.

To be done

What's not there yet:

  • use img_norm_cfg with mean and std [127.5, 127.5, 127.5] as the default for ViT checkpoints
  • upload checkpoints; for the ones I trained, the performance is reported in the README

@CLAassistant

CLAassistant commented Oct 11, 2021

CLA assistant check
All committers have signed the CLA.

@rstrudel rstrudel changed the title [Feature] Add Segmenter [Feature] Support Segmenter Oct 11, 2021
@Junjun2016
Collaborator

Hi @rstrudel
Nice PR!
Please fix the CI error and we will review it ASAP.

Co-authored-by: Junjun2016 <hejunjun@sjtu.edu.cn>
@rstrudel
Contributor Author

Hi @Junjun2016 ,
The CI error seems to come from the fact that I am using einops. I thought https://github.com/open-mmlab/mmsegmentation/pull/952/files#diff-fa602a8a75dc9dcc92261bac5f533c2a85e34fcceaff63b3a3a81d9acde2fc52R11 would fix it and include einops as a dependency, but that does not seem to be the case. Could you please tell me how I can add a dependency?

einops is quite useful, especially for weights conversion for example:
https://github.com/open-mmlab/mmsegmentation/pull/952/files#diff-c3a364134054b115f5191ac5679e696ff049bab602857b3d7fd29904e108d5c3R1-R116

@Junjun2016
Collaborator

> Hi @Junjun2016, The CI error seems to come from the fact that I am using einops. I thought https://github.com/open-mmlab/mmsegmentation/pull/952/files#diff-fa602a8a75dc9dcc92261bac5f533c2a85e34fcceaff63b3a3a81d9acde2fc52R11 would fix it and include einops as a dependency but it does not seem to be the case. Could you please tell me how I can add a dependency?
>
> einops is quite useful, especially for weights conversion for example: https://github.com/open-mmlab/mmsegmentation/pull/952/files#diff-c3a364134054b115f5191ac5679e696ff049bab602857b3d7fd29904e108d5c3R1-R116

Our principle is to use as few third-party dependencies as possible.
Maybe we can use einops in PyTorch.
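If the einops dependency is dropped, a typical `rearrange` call from the conversion script can be written with plain `reshape`/`permute`. A minimal sketch, assuming a hypothetical ViT-B/16 feature map (batch of 2, 14x14 patch grid, 768 channels):

```python
import torch

# Hypothetical ViT-B/16 output: (batch, tokens, channels) with a 14x14 patch grid
x = torch.randn(2, 196, 768)
h = w = 14

# einops version: rearrange(x, 'b (h w) c -> b c h w', h=h)
# pure-PyTorch equivalent:
y = x.reshape(x.shape[0], h, w, x.shape[2]).permute(0, 3, 1, 2).contiguous()
print(y.shape)  # torch.Size([2, 768, 14, 14])
```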

        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
    ),
    test_cfg=dict(mode='slide', crop_size=(512, 512), stride=(512, 512)),
Collaborator

It seems that the sliding window has no overlap.
Is it the same with your paper setting?
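For reference, overlap in slide inference comes from choosing a stride smaller than the crop size; with stride equal to crop size the windows only tile the image. A sketch of the window arithmetic (the helper name and the second stride value are illustrative, not from the PR):

```python
import math

def slide_windows(img_size, crop_size, stride):
    """Number of windows along one axis and the overlap between adjacent windows."""
    n = max(math.ceil((img_size - crop_size) / stride) + 1, 1)
    overlap = max(crop_size - stride, 0)
    return n, overlap

print(slide_windows(1024, 512, 512))  # (2, 0)  -> no overlap, as in this config
print(slide_windows(1024, 512, 480))  # (3, 32) -> 32-pixel overlap with stride 480
```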


@@ -0,0 +1,34 @@
# model settings
Collaborator

We should not put too many configs in _base_, just put one base config.
And inherit the base config to generate different configs with different settings (model, dataset, or setting ) in configs/segmenter by overriding the keys with different values.
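To illustrate the suggested layout: keep one base config and derive each variant in configs/segmenter by overriding keys. A hedged sketch; the file names and values here are hypothetical, following the bisenetv1 naming convention linked below:

```python
# configs/segmenter/segmenter_vit-b_linear_8x1_512x512_160k_ade20k.py (hypothetical)
_base_ = ['./segmenter_vit-b_mask_8x1_512x512_160k_ade20k.py']

# override only the keys that differ from the base config
model = dict(
    decode_head=dict(
        type='SegmenterLinearHead',
        num_classes=150))
```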


@@ -0,0 +1,20 @@
_base_ = [
Collaborator

Rename configs/segmenter/segmenter_vit-b_linear_512x512_160k_bs8_ade20k.py to configs/segmenter/segmenter_vit-b_linear_8x1_512x512_160k_ade20k.py
Refer to https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv1.

Collaborator

Update other configs according to this config's comments.

Contributor Author

@rstrudel rstrudel Oct 12, 2021

Addressed in 9deb2dc

'../_base_/default_runtime.py',
'../_base_/schedules/schedule_160k.py',
]
find_unused_parameters = True
Collaborator

Suggested change:
- find_unused_parameters = True

Remove find_unused_parameters.

        type='SegmenterLinearHead',
        in_channels=768,
        channels=768,
        num_classes=20,
Collaborator

Suggested change:
- num_classes=20,
+ num_classes=19,

The number of classes is 19 on Cityscapes.

Comment on lines +11 to +12
    auxiliary_head=[],
    test_cfg=dict(mode='slide', crop_size=(512, 512), stride=(512, 512)),
Collaborator

Can be removed since these are the same with base config.

Contributor Author

Addressed in 6f5ecd5

optimizer = dict(lr=0.001, weight_decay=0.0)

# num_gpus: 8 -> batch_size: 8
data = dict(samples_per_gpu=1, )
Collaborator

Suggested change:
- data = dict(samples_per_gpu=1, )
+ data = dict(samples_per_gpu=1)

Contributor Author

Addressed in 4f5762f

# num_gpus: 8 -> batch_size: 8
data = dict(samples_per_gpu=1, )
# TODO: handle img_norm_cfg
# img_norm_cfg = dict(mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], to_rgb=True)
Collaborator

Did you use this img_norm_cfg in your paper?

Contributor Author

@rstrudel rstrudel Oct 12, 2021

Yes, the normalization of ViT is [0.5, 0.5, 0.5] for mean and std (assuming the input tensor is in [0, 1]).
In my repository, I checked that the loading and normalization used for the ViT checkpoints were valid by verifying that the resulting performance on the ImageNet validation set was correct.
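The two conventions describe the same transform, since (x / 255 - 0.5) / 0.5 = (x - 127.5) / 127.5. A quick numerical check:

```python
import numpy as np

img = np.random.randint(0, 256, size=(4, 4, 3)).astype(np.float64)

# mean/std 0.5 applied after scaling pixels to [0, 1] ...
a = (img / 255.0 - 0.5) / 0.5
# ... equals mean/std 127.5 applied to raw [0, 255] pixels,
# i.e. what mean=[127.5]*3, std=[127.5]*3 in img_norm_cfg expresses
b = (img - 127.5) / 127.5

print(np.allclose(a, b))  # True
```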

Collaborator

So, it means that we should use this img_norm_cfg in Segmenter's base config?

Collaborator

> Yes, the normalization of ViT is [0.5, 0.5, 0.5] for mean and std (assuming the input tensor is in [0, 1]). In my repository, I checked that the loading and normalization used for the ViT checkpoints were valid by verifying that the resulting performance on the ImageNet validation set was correct.

Great, but we first need to align the inference performance for semantic segmentation; the next step is to align the training performance.


@HEADS.register_module()
class SegmenterLinearHead(BaseDecodeHead):
Collaborator

Can inherit from FCNHead and override forward.

    def __init__(self, in_channels, init_std=0.02, **kwargs):
        super(SegmenterLinearHead, self).__init__(
            in_channels=in_channels, **kwargs)
        self.head = nn.Linear(in_channels, self.num_classes)
Collaborator

Can use 1x1 conv instead.
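For context, a 1x1 convolution on a (B, C, H, W) feature map computes the same per-pixel linear map as nn.Linear over the channel dimension, without the permute round-trip. A quick equivalence check (the shapes are illustrative):

```python
import torch
import torch.nn as nn

in_channels, num_classes = 768, 19
linear = nn.Linear(in_channels, num_classes)
conv = nn.Conv2d(in_channels, num_classes, kernel_size=1)

# copy the linear weights into the conv so both heads compute the same map
with torch.no_grad():
    conv.weight.copy_(linear.weight[:, :, None, None])
    conv.bias.copy_(linear.bias)

x = torch.randn(2, in_channels, 16, 16)
out_linear = linear(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)  # channels-last round-trip
out_conv = conv(x)                                              # direct on (B, C, H, W)
print(torch.allclose(out_linear, out_conv, atol=1e-5))  # True
```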

@Junjun2016
Collaborator

Hi @rstrudel
Could you please open a new PR from a new branch instead of the master branch?
We cannot push code to your master branch.



@HEADS.register_module()
class SegmenterMaskTransformerHead(BaseDecodeHead):
Collaborator

Should also refactor this segmentation head according to the above comments.

class SegmenterLinearHead(BaseDecodeHead):

    def __init__(self, in_channels, init_std=0.02, **kwargs):
        super(SegmenterLinearHead, self).__init__(
Collaborator

Missing docstring and unittests.

class SegmenterMaskTransformerHead(BaseDecodeHead):

    def __init__(
        self,
Collaborator

Missing docstring and unittests.

@rstrudel
Contributor Author

rstrudel commented Oct 12, 2021

> Hi @rstrudel Could you please push a new PR from a new branch instead of the master branch? Since we can not push code to your master branch.

Thanks for all the comments @Junjun2016, I will work on them as soon as I can. Let me create a PR from a new branch.

@rstrudel
Contributor Author

I am closing this PR following #952 (comment); the new PR is #955 and I will make progress there.

@rstrudel rstrudel closed this Oct 14, 2021
michaelzhang-ai pushed a commit to michaelzhang-ai/mmsegmentation that referenced this pull request Mar 22, 2024
* fix demo

* update

* fix

* fix bug

* fix bug

* update doc

3 participants