Is your feature request related to a problem? Please describe.
The issue is related to #5620 and #6011. When a DeepSpeed model is initialised for ZeRO-3 inference, with a DeepSpeedZeRoOffload optimizer for example, the model cannot be moved to the CPU, either via the torch.nn.Module.to() functionality or with the new offload_states API.
Describe the solution you'd like
Either extend #6011 to support offloading a model configured for ZeRO-3 inference, or add a new API that supports this.
Thanks