Passing multiple models with DeepSpeed will fail  #1388

@uygnef

Description

Hi there! I have a question regarding the latest version. Does it support passing multiple models? Currently I am training only the UNet, but I want to enable ZeRO stage 3 for all three models (VAE, text encoder, and UNet). I modified the code as follows, but unfortunately it fails:

    # Wrap each model in its own DeepSpeed engine, all sharing the same ds_config
    vae, _, _, _ = deepspeed.initialize(model=vae, config_params=ds_config, model_parameters=None)
    text_encoder, _, _, _ = deepspeed.initialize(model=text_encoder, config_params=ds_config)
    # Only the UNet engine is given parameters to optimize
    unet, optimizer, _, _ = deepspeed.initialize(model=unet, config_params=ds_config, model_parameters=unet.parameters())

    # train step
    for step, batch in enumerate(train_dataloader):
        ...
        latents = vae.encode(
                    batch["pixel_values"].to(accelerator.device, dtype=weight_dtype)).latent_dist.sample()
        ...
        encoder_hidden_states = text_encoder(batch["input_ids"].to(accelerator.device))[0]
        ...
        model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
        loss = F.mse_loss(model_pred.float(), target.float(), reduction="mean")
        text_encoder.backward(loss)
        text_encoder.step() 

Another issue I am facing is that both the text encoder and the UNet need to be trained. How should I modify the code to handle this? Could hybrid_engine solve this issue?
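For context, a minimal sketch of the `ds_config` I am passing to each `deepspeed.initialize` call, with ZeRO stage 3 enabled (the batch size and precision values below are illustrative placeholders, not my exact settings):

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "zero_optimization": {
    "stage": 3
  },
  "bf16": {
    "enabled": true
  }
}
```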
