🐛 [Bug] Torch-TensorRT do not support gpt2 #867

@Biaocsu


Bug Description

ERROR: [Torch-TensorRT] - Unsupported operator: aten::where.self(Tensor condition, Tensor self, Tensor other) -> (Tensor)
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(206): _attn
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(336): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(395): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(890): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(1047): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(958): trace_module
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(741): trace
<stdin>(1): <module>

ERROR: [Torch-TensorRT] - Unsupported operator: aten::Int.Tensor(Tensor a) -> (int)

ERROR: [Torch-TensorRT] - Unsupported operator: aten::ScalarImplicit(Tensor a) -> (Scalar)

WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32
WARNING: [Torch-TensorRT] - Truncating graph input type from at::kLong to at::kInt
WARNING: [Torch-TensorRT] - Truncating graph input type from at::kLong to at::kInt
WARNING: [Torch-TensorRT] - Truncating graph input type from at::kLong to at::kInt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/_compile.py", line 115, in compile
    return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/ts/_compiler.py", line 119, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py(2047): embedding
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/sparse.py(158): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(833): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py(1047): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(958): trace_module
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(741): trace
<stdin>(1): <module>
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDAFloatType instead (while checking arguments for embedding)
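The final RuntimeError points at the proximate cause: `torch.nn.functional.embedding` only accepts integer (`Long`/`Int`) index tensors, and the shape-analysis warning above shows the graph input was defaulted to F32, so the token ids reach the embedding lookup as floats. The constraint itself can be reproduced in plain PyTorch, independent of TensorRT (a minimal sketch; the vocab/embedding sizes are illustrative, not GPT-2's real config):

```python
import torch

emb = torch.nn.Embedding(50257, 8)  # GPT-2-sized vocab, tiny embedding dim for illustration

# Integer indices (the dtype a tokenizer normally produces) work fine.
ids = torch.tensor([[15, 2, 7]], dtype=torch.long)
assert emb(ids).shape == (1, 3, 8)

# Float indices raise the same "Expected tensor for argument #1 'indices'"
# RuntimeError seen in the traceback above.
try:
    emb(ids.float())
except RuntimeError as e:
    print("embedding rejected float indices:", e)
```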

To Reproduce

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch_tensorrt

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2', return_dict=False)
model.eval()

tokens = tokenizer('The cat is on the table.', return_tensors='pt')['input_ids']
traced_model = torch.jit.trace(model, tokens)

compile_settings = {
    "inputs": [torch_tensorrt.Input(
        # min_shape=[1, 3, 224, 224],
        # opt_shape=[1, 3, 512, 512],
        # max_shape=[1, 3, 1024, 1024],
        # For static size
        shape=[1, 7],
        dtype=torch.int32,  # Datatype of input tensor. Allowed options: torch.(float|half|int8|int32|bool)
    )],
    "truncate_long_and_double": True,
    "enabled_precisions": {torch.half},  # Run with FP16
}
trt_model = torch_tensorrt.compile(traced_model, **compile_settings)

Expected behavior

No error

Environment

docker build --build-arg BASE=21.10 -f docker/Dockerfile -t torch_tensorrt:latest .

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

So does this really mean torch_tensorrt does not support GPT-2, or did I do something wrong?
