🐛 Bug

During finetuning with complex models, a call to `BaseFinetuning.unfreeze_and_add_param_group` can raise the following warning:

/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py:2882: UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
  exec(code_obj, self.user_global_ns, self.user_ns)

What happens is that, due to the way `BaseFinetuning` flattens the model before collecting the parameters, the same parameters can be listed twice. It iterates over all of the model's `.modules()`, but fails to filter the result down to leaf modules when the model is nested and contains custom blocks.

One example of a model where the problem happens is:

A call to `BaseFinetuning.flatten_modules(model)` using the model above returns both the leaf modules (conv2d, relu, batchnorm) and the ConvBlocks, listing all the layers twice.

Please reproduce using the BoringModel

The BoringModel is simple enough that the issue doesn't appear, so I added the simplest model I could write that reproduces the issue.
https://colab.research.google.com/drive/1-YR26kK41kCCNmaL8MYVCL8831FvbHNa?usp=sharing
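The duplication described above can be sketched without a PyTorch dependency. The minimal `Module` class below is a stand-in for `torch.nn.Module` (all names here are illustrative, not Lightning's actual code): iterating over every module in the tree collects each leaf's parameters once per ancestor, whereas filtering to leaf modules collects them exactly once.

```python
class Module:
    """Minimal stand-in for torch.nn.Module, for illustration only."""

    def __init__(self, name, children=(), params=()):
        self.name = name
        self.children_ = list(children)
        self.params = list(params)

    def modules(self):
        # Yields this module and all descendants, like nn.Module.modules().
        yield self
        for child in self.children_:
            yield from child.modules()

    def parameters(self):
        # Own parameters plus those of all descendants (recursive).
        yield from self.params
        for child in self.children_:
            yield from child.parameters()


def flatten_naive(model):
    # Mirrors the buggy behaviour: every module in the tree is kept,
    # so a ConvBlock AND its conv/batchnorm children all appear.
    return list(model.modules())


def flatten_leaves(model):
    # The filtered variant: keep only leaf modules (no children),
    # so each layer appears exactly once.
    return [m for m in model.modules() if not m.children_]


conv_block = Module("block", children=[
    Module("conv2d", params=["w_conv"]),
    Module("batchnorm", params=["w_bn"]),
])
model = Module("model", children=[conv_block])

naive_params = [p for m in flatten_naive(model) for p in m.parameters()]
leaf_params = [p for m in flatten_leaves(model) for p in m.parameters()]
# naive_params contains "w_conv" three times (via model, block, and conv2d);
# leaf_params contains each parameter exactly once.
```

Collecting `naive_params` into a single optimizer param group is what triggers the "duplicate parameters" warning from PyTorch.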
Expected behavior
`BaseFinetuning` should not try to add the same parameter twice when unfreezing complex models.
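One straightforward way to guarantee this, sketched here in plain Python (an illustration of the idea, not the actual Lightning fix), is to deduplicate the collected parameters by object identity before building the optimizer param group:

```python
def unique_params(params):
    """Drop repeated parameter objects, keeping first-seen order.

    Deduplicates by object identity (id), since the same tensor object
    listed twice is exactly what triggers PyTorch's duplicate-parameter
    warning.
    """
    seen = set()
    out = []
    for p in params:
        if id(p) not in seen:
            seen.add(id(p))
            out.append(p)
    return out


# Two distinct "parameters"; w_conv was collected twice by the flattening.
w_conv, w_bn = object(), object()
deduped = unique_params([w_conv, w_bn, w_conv])
# deduped is [w_conv, w_bn]: each parameter object appears once.
```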
Environment
Bug reproduced on Colab, with a CPU-only runtime.
Additional context