🐛 [Bug] Torch-TensorRT do not support gpt2 #867

@Biaocsu


Bug Description

ERROR: [Torch-TensorRT] - Unsupported operator: aten::where.self(Tensor condition, Tensor self, Tensor other) -> (Tensor)
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(206): _attn
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(336): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(395): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(890): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(1047): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(958): trace_module
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(741): trace
<stdin>(1): <module>

ERROR: [Torch-TensorRT] - Unsupported operator: aten::Int.Tensor(Tensor a) -> (int)

ERROR: [Torch-TensorRT] - Unsupported operator: aten::ScalarImplicit(Tensor a) -> (Scalar)

WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32
WARNING: [Torch-TensorRT] - Truncating graph input type from at::kLong to at::kInt
WARNING: [Torch-TensorRT] - Truncating graph input type from at::kLong to at::kInt
WARNING: [Torch-TensorRT] - Truncating graph input type from at::kLong to at::kInt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/_compile.py", line 115, in compile
    return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/ts/_compiler.py", line 119, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py(2047): embedding
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/sparse.py(158): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(833): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(1047): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(958): trace_module
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(741): trace
<stdin>(1): <module>
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDAFloatType instead (while checking arguments for embedding)
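The final RuntimeError points at the proximate cause: `torch.nn.functional.embedding` only accepts integer (`Long`/`Int`) index tensors, and the shape-analysis warning above shows the graph input was defaulted to F32, so the token ids reach the embedding lookup as floats. The constraint itself can be reproduced in plain PyTorch, independent of TensorRT (a minimal sketch; the vocab/embedding sizes are illustrative, not GPT-2's real config):

```python
import torch

emb = torch.nn.Embedding(50257, 8)  # GPT-2-sized vocab, tiny embedding dim for illustration

# Integer indices (the dtype a tokenizer normally produces) work fine.
ids = torch.tensor([[15, 2, 7]], dtype=torch.long)
assert emb(ids).shape == (1, 3, 8)

# Float indices raise the same "Expected tensor for argument #1 'indices'"
# RuntimeError seen in the traceback above.
try:
    emb(ids.float())
except RuntimeError as e:
    print("embedding rejected float indices:", e)
```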

To Reproduce

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch_tensorrt

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2', return_dict=False)
model.eval()

tokens = tokenizer('The cat is on the table.', return_tensors='pt')['input_ids']
traced_model = torch.jit.trace(model, tokens)

compile_settings = {
    "inputs": [torch_tensorrt.Input(
        # min_shape=[1, 3, 224, 224],
        # opt_shape=[1, 3, 512, 512],
        # max_shape=[1, 3, 1024, 1024],
        # For static size
        shape=[1, 7],
        dtype=torch.int32,  # Datatype of input tensor. Allowed options: torch.(float|half|int8|int32|bool)
    )],
    "truncate_long_and_double": True,
    "enabled_precisions": {torch.half},  # Run with FP16
}
trt_model = torch_tensorrt.compile(traced_model, **compile_settings)

Expected behavior

No error

Environment

docker build --build-arg BASE=21.10 -f docker/Dockerfile -t torch_tensorrt:latest .

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

So does this really mean torch_tensorrt does not support GPT-2, or did I do something wrong?
