Closed
Describe the bug
test-electra.py fails with the following error:
File "/home/deepspeed/data/DeepSpeed/deepspeed/module_inject/layers.py", line 42, in forward
output = torch.matmul(input, self.weight.transpose(-1, -2))
RuntimeError: mat1 and mat2 shapes cannot be multiplied (18x128 and 256x128)
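The shapes in the error can be checked by hand. The sketch below is only an illustration of the 2-D matrix product shape rule in plain Python (no torch needed); `matmul_shape` and `transposed` are hypothetical helper names, not DeepSpeed or PyTorch functions. It assumes, from the error message, that `self.weight.transpose(-1, -2)` is 256x128, so `self.weight` itself would be 128x256. It does not say why the weight ends up with that shape.

```python
def matmul_shape(a, b):
    """Return the result shape of a 2-D matrix product, or raise ValueError."""
    (m, k1), (k2, n) = a, b
    if k1 != k2:
        # The inner dimensions must agree for (m x k) @ (k x n).
        raise ValueError(f"shapes cannot be multiplied ({m}x{k1} and {k2}x{n})")
    return (m, n)


def transposed(shape):
    """Shape of a 2-D matrix after swapping its last two dimensions."""
    return shape[::-1]


input_shape = (18, 128)
# Assumption inferred from the error: the transposed weight is 256x128,
# so the stored weight is 128x256.
weight_shape = (128, 256)

try:
    matmul_shape(input_shape, transposed(weight_shape))
    failed = False
    message = ""
except ValueError as err:
    failed = True
    message = str(err)

# For comparison, a stored weight of 256x128 would be compatible with
# an input whose last dimension is 128.
compatible = matmul_shape(input_shape, transposed((256, 128)))
```

With an input whose last dimension is 128, the product only works if the stored weight has 128 as its second dimension, which is not the case for the shapes reported here.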
To Reproduce
Steps to reproduce the behavior:
- Inference Script : https://github.com/microsoft/DeepSpeedExamples/blob/master/inference/huggingface/test-electra.py
- Packages: DeepSpeed from master, ff42743, torch 1.12, CUDA 11.6, transformers 4.21.2
- Command: deepspeed --num_gpus 2 test-electra.py
System info (please complete the following information):
- OS: Ubuntu 20.04.5 LTS
- GPU count and types: 2x RTX A6000
- Python version: 3.8.10
Additional context
This test does not fail with DeepSpeed 0.7.3.

