[torchlib] Register aten.linear and use matmul to simplify graph #2021

justinchuby · 2025-01-18T00:58:22Z

Use matmul when the input is not rank 2 to avoid decomp to addmm.

justinchuby · 2025-01-18T01:04:18Z

import torch


class TestModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(10, 10)

    def forward(self, x):
        return self.linear(x)


model = TestModule()
ep = torch.onnx.export(model, (torch.randn(1, 10),), dynamo=True, verify=True)
print(ep)
ep = torch.onnx.export(model, (torch.randn(1, 12, 15, 10),), dynamo=True, verify=True)
print(ep)

[torch.onnx] Obtain model graph for `TestModule([...]` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `TestModule([...]` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
[torch.onnx] Check the ONNX model...
[torch.onnx] Check the ONNX model... ✅
[torch.onnx] Execute the model with ONNX Runtime...
[torch.onnx] Execute the model with ONNX Runtime... ✅
[torch.onnx] Verify output accuracy...
[torch.onnx] Verify output accuracy... ✅
ONNXProgram(
    model=
        <
            ir_version=10,
            opset_imports={'pkg.onnxscript.torch_lib.common': 1, '': 18},
            producer_name='pytorch',
            producer_version='2.7.0.dev20250115+cu124',
            domain=None,
            model_version=None,
        >
        graph(
            name=main_graph,
            inputs=(
                %"x"<FLOAT,[1,10]>
            ),
            outputs=(
                %"linear"<FLOAT,[1,10]>
            ),
            initializers=(
                %"linear.weight"<FLOAT,[10,10]>,
                %"linear.bias"<FLOAT,[10]>
            ),
        ) {
            0 |  # node_Gemm_0
                 %"linear"<FLOAT,[1,10]> ⬅️ ::Gemm(%"x", %"linear.weight", %"linear.bias") {beta=1.0, transB=True, alpha=1.0, transA=0}
            return %"linear"<FLOAT,[1,10]>
        }

        <
            opset_imports={'': 18},
        >
        def pkg.onnxscript.torch_lib.common::Rank(
            inputs=(
                %"input"<?,?>
            ),
            outputs=(
                %"return_val"<?,?>
            ),
        ) {
            0 |  # n0
                 %"tmp"<?,?> ⬅️ ::Shape(%"input")
            1 |  # n1
                 %"return_val"<?,?> ⬅️ ::Size(%"tmp")
            return %"return_val"<?,?>
        }

        <
            opset_imports={'': 18},
        >
        def pkg.onnxscript.torch_lib.common::IsScalar(
            inputs=(
                %"input"<?,?>
            ),
            outputs=(
                %"return_val"<?,?>
            ),
        ) {
            0 |  # n0
                 %"tmp"<?,?> ⬅️ ::Shape(%"input")
            1 |  # n1
                 %"tmp_0"<?,?> ⬅️ ::Size(%"tmp")
            2 |  # n2
                 %"tmp_1"<?,?> ⬅️ ::Constant() {value_int=0}
            3 |  # n3
                 %"return_val"<?,?> ⬅️ ::Equal(%"tmp_0", %"tmp_1")
            return %"return_val"<?,?>
        }
    ,
    exported_program=
        ExportedProgram:
            class GraphModule(torch.nn.Module):
                def forward(self, p_linear_weight: "f32[10, 10]", p_linear_bias: "f32[10]", x: "f32[1, 10]"):
                     # File: /home/justinchu/anaconda3/envs/onnx/lib/python3.13/site-packages/torch/nn/modules/linear.py:125 in forward, code: return F.linear(input, self.weight, self.bias)
                    linear: "f32[1, 10]" = torch.ops.aten.linear.default(x, p_linear_weight, p_linear_bias);  x = p_linear_weight = p_linear_bias = None
                    return (linear,)
            
        Graph signature: ExportGraphSignature(input_specs=[InputSpec(kind=<InputKind.PARAMETER: 2>, arg=TensorArgument(name='p_linear_weight'), target='linear.weight', persistent=None), InputSpec(kind=<InputKind.PARAMETER: 2>, arg=TensorArgument(name='p_linear_bias'), target='linear.bias', persistent=None), InputSpec(kind=<InputKind.USER_INPUT: 1>, arg=TensorArgument(name='x'), target=None, persistent=None)], output_specs=[OutputSpec(kind=<OutputKind.USER_OUTPUT: 1>, arg=TensorArgument(name='linear'), target=None)])
        Range constraints: {}

)

[torch.onnx] Obtain model graph for `TestModule([...]` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `TestModule([...]` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
[torch.onnx] Check the ONNX model...
[torch.onnx] Check the ONNX model... ✅
[torch.onnx] Execute the model with ONNX Runtime...
[torch.onnx] Execute the model with ONNX Runtime... ✅
[torch.onnx] Verify output accuracy...
[torch.onnx] Verify output accuracy... ✅
ONNXProgram(
    model=
        <
            ir_version=10,
            opset_imports={'pkg.onnxscript.torch_lib.common': 1, '': 18},
            producer_name='pytorch',
            producer_version='2.7.0.dev20250115+cu124',
            domain=None,
            model_version=None,
        >
        graph(
            name=main_graph,
            inputs=(
                %"x"<FLOAT,[1,12,15,10]>
            ),
            outputs=(
                %"linear"<FLOAT,[1,12,15,10]>
            ),
            initializers=(
                %"linear.weight"<FLOAT,[10,10]>,
                %"linear.bias"<FLOAT,[10]>
            ),
        ) {
            0 |  # node_Transpose_0
                 %"val_0"<?,?> ⬅️ ::Transpose(%"linear.weight") {perm=[1, 0]}
            1 |  # node_MatMul_1
                 %"val_1"<?,?> ⬅️ ::MatMul(%"x", %"val_0")
            2 |  # node_Add_2
                 %"linear"<FLOAT,[1,12,15,10]> ⬅️ ::Add(%"val_1", %"linear.bias")
            return %"linear"<FLOAT,[1,12,15,10]>
        }

        <
            opset_imports={'': 18},
        >
        def pkg.onnxscript.torch_lib.common::Rank(
            inputs=(
                %"input"<?,?>
            ),
            outputs=(
                %"return_val"<?,?>
            ),
        ) {
            0 |  # n0
                 %"tmp"<?,?> ⬅️ ::Shape(%"input")
            1 |  # n1
                 %"return_val"<?,?> ⬅️ ::Size(%"tmp")
            return %"return_val"<?,?>
        }

        <
            opset_imports={'': 18},
        >
        def pkg.onnxscript.torch_lib.common::IsScalar(
            inputs=(
                %"input"<?,?>
            ),
            outputs=(
                %"return_val"<?,?>
            ),
        ) {
            0 |  # n0
                 %"tmp"<?,?> ⬅️ ::Shape(%"input")
            1 |  # n1
                 %"tmp_0"<?,?> ⬅️ ::Size(%"tmp")
            2 |  # n2
                 %"tmp_1"<?,?> ⬅️ ::Constant() {value_int=0}
            3 |  # n3
                 %"return_val"<?,?> ⬅️ ::Equal(%"tmp_0", %"tmp_1")
            return %"return_val"<?,?>
        }
    ,
    exported_program=
        ExportedProgram:
            class GraphModule(torch.nn.Module):
                def forward(self, p_linear_weight: "f32[10, 10]", p_linear_bias: "f32[10]", x: "f32[1, 12, 15, 10]"):
                     # File: /home/justinchu/anaconda3/envs/onnx/lib/python3.13/site-packages/torch/nn/modules/linear.py:125 in forward, code: return F.linear(input, self.weight, self.bias)
                    linear: "f32[1, 12, 15, 10]" = torch.ops.aten.linear.default(x, p_linear_weight, p_linear_bias);  x = p_linear_weight = p_linear_bias = None
                    return (linear,)
            
        Graph signature: ExportGraphSignature(input_specs=[InputSpec(kind=<InputKind.PARAMETER: 2>, arg=TensorArgument(name='p_linear_weight'), target='linear.weight', persistent=None), InputSpec(kind=<InputKind.PARAMETER: 2>, arg=TensorArgument(name='p_linear_bias'), target='linear.bias', persistent=None), InputSpec(kind=<InputKind.USER_INPUT: 1>, arg=TensorArgument(name='x'), target=None, persistent=None)], output_specs=[OutputSpec(kind=<OutputKind.USER_OUTPUT: 1>, arg=TensorArgument(name='linear'), target=None)])
        Range constraints: {}

)

codecov · 2025-01-18T01:04:45Z

❌ 51 Tests Failed:

Tests completed	Failed	Passed	Skipped
11814	51	11763	2454

View the top 1 failed tests by shortest run time

onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0212_test_cast_STRING_to_FLOAT

Stack Traces | 0.003s run time

onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_cast_STRING_to_FLOAT'

The above exception was the direct cause of the following exception:
.nox\test\lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_cast_STRING_to_FLOAT' (e=No module named 'tests.onnx_backend_test_code.test_cast_STRING_to_FLOAT') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_cast_STRING_to_FLOAT.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_cast_STRING_to_FLOAT.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import FLOAT, STRING
E   from onnxscript.onnx_opset import opset21
E   
E   @script()
E   def bck_test_cast_STRING_to_FLOAT(input: STRING[3,4]) -> (FLOAT[3,4]):
E       output = opset21.Cast(input, to=1)
E       return output

View the full list of 2 ❄️ flaky tests

onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0634_test_max_int32

Flake rate in main: 5.00% (Passed 38 times, Failed 2 times)

Stack Traces | 0.003s run time

onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_max_int32'

The above exception was the direct cause of the following exception:
.nox\test\lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_max_int32' (e=No module named 'tests.onnx_backend_test_code.test_max_int32') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_max_int32.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_max_int32.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import INT32
E   from onnxscript.onnx_opset import opset13
E   
E   @script()
E   def bck_test_max_int32(data_0: INT32[3], data_1: INT32[3]) -> (INT32[3]):
E       result = opset13.Max(data_0, data_1)
E       return result

onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_1104_test_shape_start_1_end_2

Flake rate in main: 12.50% (Passed 14 times, Failed 2 times)

Stack Traces | 0.003s run time

onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_shape_start_1_end_2'

The above exception was the direct cause of the following exception:
.nox\test\lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_shape_start_1_end_2' (e=No module named 'tests.onnx_backend_test_code.test_shape_start_1_end_2') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_shape_start_1_end_2.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_shape_start_1_end_2.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import FLOAT, INT64
E   from onnxscript.onnx_opset import opset21
E   
E   @script()
E   def bck_test_shape_start_1_end_2(x: FLOAT[3,4,5]) -> (INT64[1]):
E       y = opset21.Shape(x, end=2, start=1)
E       return y

To view more test analytics, go to the Test Analytics Dashboard
📢 Thoughts on this report? Let us know!

titaiwangms · 2025-01-18T01:04:51Z

https://github.com/microsoft/onnxscript/pull/1821/files also disabled linear_bias though, do we want that back as well?

justinchuby · 2025-01-18T01:08:13Z

Dynamic axes works too

model = TestModule()
ep = torch.onnx.export(model, (torch.randn(2, 10),), dynamic_shapes=(
    {0: torch.export.Dim.DYNAMIC},
), dynamo=True, verify=True)
print(ep)
ep = torch.onnx.export(model, (torch.randn(2, 12, 15, 10),), dynamo=True, dynamic_shapes=(
    {0: torch.export.Dim.DYNAMIC},
), verify=True)
print(ep)

justinchuby · 2025-01-18T01:08:54Z

#1821 (files) also disabled linear_bias though, do we want that back as well?

I combined the implementation.

### Description This PR adds fusions for [Google's SigLIP model](https://huggingface.co/google/siglip-base-patch16-224/) and Microsoft's internal conformer-encoder model. Here is an example of how to run the ORT transformer optimizer for the SigLIP model. ``` $ git clone https://github.com/microsoft/onnxruntime $ cd onnxruntime/onnxruntime/python/tools/transformers $ python3 optimizer.py --input /path/to/model.onnx --output /path/to/model_opt.onnx --model_type clip --num_heads 16 --hidden_size 1152 --use_external_data_format --opt_level 0 --disable_shape_inference ``` Here is an example of how to run the ORT transformer optimizer for the conformer-encoder model. ``` $ git clone https://github.com/microsoft/onnxruntime $ cd onnxruntime/onnxruntime/python/tools/transformers $ python3 optimizer.py --input /path/to/model.onnx --output /path/to/model_opt.onnx --model_type conformer --num_heads 16 --hidden_size 1024 --use_external_data_format --opt_level 0 --disable_shape_inference --convert_attribute ``` ### Motivation and Context This PR helps optimize multi-modal models that use SigLIP for the vision encoder and conformer-encoder for the speech encoder. This PR uses changes from the following PRs: - pytorch/pytorch#144801 - microsoft/onnxscript#2018 - microsoft/onnxscript#2019 - microsoft/onnxscript#2020 - microsoft/onnxscript#2021 - microsoft/onnxscript#2022 - microsoft/onnxscript#2024 - microsoft/onnxscript#2025 - microsoft/onnxscript#2029 - microsoft/onnxscript#2033 ### Introduction of ONNX Script This PR introduces [ONNX Script](https://github.com/microsoft/onnxscript) into the ORT transformer optimizer as an optional step via the `fold_transpose_initializers()` method of the `DynamoOnnxHelper` class.

justinchuby added 2 commits January 17, 2025 16:46

[torchlib] Register linear and use matmul to simplify graph

de19f32

gemm

61a474e

justinchuby requested review from shubhambhokare1 and titaiwangms January 18, 2025 00:58

justinchuby added the module: torchlib Related to the torch/aten function lib in development label Jan 18, 2025

justinchuby added 2 commits January 17, 2025 16:59

Merge branch 'main' into justinchu/better-linear

77e2a2f

test

32db4bc

titaiwangms approved these changes Jan 18, 2025

View reviewed changes

justinchuby enabled auto-merge (squash) January 18, 2025 01:02

justinchuby merged commit e7d199e into main Jan 18, 2025
22 of 27 checks passed

justinchuby deleted the justinchu/better-linear branch January 18, 2025 01:19

kunal-vaishnavi mentioned this pull request Jan 29, 2025

Add fusions for SigLIP and Conformer-Encoder microsoft/onnxruntime#23528

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[torchlib] Register aten.linear and use matmul to simplify graph #2021

[torchlib] Register aten.linear and use matmul to simplify graph #2021

Uh oh!

justinchuby commented Jan 18, 2025 •

edited

Loading

Uh oh!

justinchuby commented Jan 18, 2025 •

edited

Loading

Uh oh!

codecov bot commented Jan 18, 2025 •

edited

Loading

Uh oh!

titaiwangms commented Jan 18, 2025

Uh oh!

justinchuby commented Jan 18, 2025

Uh oh!

justinchuby commented Jan 18, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[torchlib] Register aten.linear and use matmul to simplify graph #2021

[torchlib] Register aten.linear and use matmul to simplify graph #2021

Uh oh!

Conversation

justinchuby commented Jan 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

justinchuby commented Jan 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Jan 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ 51 Tests Failed:

Uh oh!

titaiwangms commented Jan 18, 2025

Uh oh!

justinchuby commented Jan 18, 2025

Uh oh!

justinchuby commented Jan 18, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

justinchuby commented Jan 18, 2025 •

edited

Loading

justinchuby commented Jan 18, 2025 •

edited

Loading

codecov bot commented Jan 18, 2025 •

edited

Loading