Selectively enable different frontends #2693
Conversation
gs-olive
left a comment
Overall looks great; the overhaul of the dtype system and frontend selection at install time is very helpful. I just need to verify on local installs as well. Left a few small comments.
    or torch_tensorrt.dtype.float in enabled_precisions
):
    precision = torch.float32
if dtype.float16 in enabled_precisions or dtype.half in enabled_precisions:
    precision = dtype.float16
elif dtype.float32 in enabled_precisions or dtype.float in enabled_precisions:
    precision = dtype.float32

dtype.float16 and dtype.half seem to point to the same enum object. Should this be:
if dtype.float16 in enabled_precisions or torch.float16 in enabled_precisions:
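A minimal sketch of the suggested condition, assuming `enabled_precisions` may hold either `torch_tensorrt` `dtype` members or native `torch.dtype` values:

```python
# Sketch only: dtype.half aliases dtype.float16, so checking both enum names is redundant;
# the suggestion above is to additionally accept the torch-native dtype in the set.
if dtype.float16 in enabled_precisions or torch.float16 in enabled_precisions:
    precision = dtype.float16
elif dtype.float32 in enabled_precisions or torch.float32 in enabled_precisions:
    precision = dtype.float32
```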
Force-pushed from 0045465 to ae453fc.
Commits added (each signed off by Naren Dasan <naren@narendasan.com> and <narens@nvidia.com>), including:
- …pes in the python package to decouple frontends
- dynamo
- torch.fx.passes.splitter_base._SplitterBase
Force-pushed from 959f3e5 to b21fb5f.
py/torch_tensorrt/_compile.py
Outdated
| "Input is a torchscript module but the ir was not specified (default=dynamo), please set ir=torchscript to suppress the warning." | ||
| ) | ||
| return _IRType.ts | ||
| elif module_is_exportable: |
Need to add a feature check here.
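A hypothetical sketch of what such a feature check could look like, continuing the branch quoted above; the `dynamo_frontend` flag name, the error message, and `_IRType.dynamo` are assumptions for illustration:

```python
elif module_is_exportable:
    # Hypothetical guard: only select the dynamo IR when that frontend is part of the build
    if not torch_tensorrt.ENABLED_FEATURES.dynamo_frontend:
        raise ValueError(
            "Module is exportable, but the dynamo frontend is not enabled in this build"
        )
    return _IRType.dynamo
```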
py/torch_tensorrt/_compile.py
Outdated
    inputs: Optional[Sequence[Input | torch.Tensor]] = None,
    ir: str = "default",
    enabled_precisions: Optional[Set[torch.dtype | dtype]] = None,
    enabled_precisions: Optional[Set[torch.dtype]] = None,
Correct type annotation
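The intended correction is not spelled out in the comment; one possible reading, sketched below, is an annotation that accepts both torch-native and torch_tensorrt dtypes (the `Union` spelling is an assumption):

```python
enabled_precisions: Optional[Set[Union[torch.dtype, dtype]]] = None,
```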
gs-olive
left a comment
Overall looks great - added some small style/clarification/logging comments
Additionally, tested on Windows E2E models and appears to cleanly dispatch to the correct runtime without needing to modify the …

A few examples, such as the one below, use …
peri044
left a comment
LGTM. Added minor comments
py/torch_tensorrt/_enums.py
Outdated
    use_default: bool,
) -> Optional[Union[torch.dtype, trt.DataType, np.dtype, dtype]]:
    try:
        print(self)
can remove this statement
py/torch_tensorrt/_enums.py
Outdated
elif t == DeviceType:
    return self
I'm curious when we would need to cast EngineCapability to DeviceType. Any examples?
    use_fast_partitioner: bool = USE_FAST_PARTITIONER,
    enable_experimental_decompositions: bool = ENABLE_EXPERIMENTAL_DECOMPOSITIONS,
    enabled_precisions: Set[torch.dtype | dtype] | Tuple[torch.dtype | dtype] = (
        dtype.float32,
Shouldn't this be _defaults.ENABLED_PRECISIONS?
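A sketch of the suggested default, assuming `_defaults.ENABLED_PRECISIONS` exists alongside the other defaults referenced in this signature:

```python
enabled_precisions: Set[torch.dtype | dtype] | Tuple[torch.dtype | dtype] = _defaults.ENABLED_PRECISIONS,
```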
compilation_options = {
    "precision": precision,
    "enabled_precisions": enabled_precisions,
It seems like you've handled the case where enabled_precisions is empty in the compile function. We can use the same here:
"enabled_precisions": (
    enabled_precisions if enabled_precisions else _defaults.ENABLED_PRECISIONS
),
@@ -64,7 +64,7 @@ include-package-data = false

Where would this file be used?
Will docs be updated with the instructions in a separate PR?
Description
Allows users to configure which frontends/features they want to include in the
`torch-tensorrt` builds. Features can be enabled like so:
A build's feature set can be accessed via the following struct,
where `ENABLED_FEATURES` is a `namedtuple` `FeatureSet`.
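For example, a build's features could be queried like this (a sketch; `torchscript_frontend` is the only field mentioned in this PR, and other fields are assumed to follow the same pattern):

```python
import torch_tensorrt

# ENABLED_FEATURES is described as a namedtuple (FeatureSet), so fields read as attributes
if torch_tensorrt.ENABLED_FEATURES.torchscript_frontend:
    print("TorchScript frontend is available in this build")
```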
In order to support optional features, a number of core types have been abstracted:

- `torch_tensorrt.Device` has no direct dependencies on the TorchScript core and can be translated to `torch_tensorrt.ts.Device` to access those features
- `torch_tensorrt.Input` behaves the same way
- `dtype`, `DeviceType`, and `memory_format` enums have been defined and can translate from `numpy`, `tensorrt`, `torch`, and `torch_tensorrt._C` (assuming `torch_tensorrt.ENABLED_FEATURES.torchscript_frontend` is `True`)

Translating between different library enums can now take the form sketched below:
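A sketch of such a translation, based on the `to()` signature visible in the `_enums.py` diff above; the exact call pattern and the return values noted in comments are assumptions:

```python
import numpy as np
import tensorrt as trt
import torch

from torch_tensorrt import dtype

# Translate a torch_tensorrt dtype into the equivalent type of another library (sketch)
torch_fp16 = dtype.float16.to(torch.dtype)  # expected: torch.float16
trt_fp16 = dtype.float16.to(trt.DataType)   # expected: trt.DataType.HALF
np_fp16 = dtype.float16.to(np.dtype)        # expected: numpy float16
```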
Fixes #1943
Fixes #2379
Type of change
Please delete options that are not relevant and/or add your own.
Checklist: