[Fixbug] group_start and group_end should be importable without nccl #317

Merged

soodoshll merged 1 commit into hidet-org:main from soodoshll:fix-nccl2
Jul 16, 2023
Conversation

@soodoshll
Collaborator

No description provided.
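The PR title implies the usual import-guard pattern: `group_start` and `group_end` stay importable whether or not NCCL is present, and only raise when actually called. A minimal sketch of that pattern, assuming a hypothetical `nccl_runtime` binding (not hidet's actual module layout):

```python
# Hypothetical sketch: define group_start/group_end unconditionally so that
# importing them never fails on machines without NCCL; the availability
# check is deferred until the functions are actually called.
try:
    import nccl_runtime  # hypothetical NCCL binding; absent on most systems
    NCCL_AVAILABLE = True
except ImportError:
    NCCL_AVAILABLE = False


def group_start():
    if not NCCL_AVAILABLE:
        raise RuntimeError('NCCL is not available')
    nccl_runtime.group_start()


def group_end():
    if not NCCL_AVAILABLE:
        raise RuntimeError('NCCL is not available')
    nccl_runtime.group_end()
```

With this structure, `from comm import group_start, group_end` succeeds everywhere, and the error surfaces only at call time on NCCL-less machines.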

@soodoshll soodoshll merged commit ccef079 into hidet-org:main Jul 16, 2023
@soodoshll soodoshll deleted the fix-nccl2 branch August 3, 2023 16:17
vadiklyutiy pushed a commit that referenced this pull request Jul 22, 2024
After disallowing functions unsupported by Hidet as in #317, the compilation of the model `vision_maskrcnn` (which previously failed on the unsupported `topk` method, as in #267) failed on a TypeError with the following traceback:

```
File "/home/bolin/Desktop/hidet/python/hidet/graph/graph_utils/functors.py", line 75, in visit
    ret = self.visit_Operator(obj)  # pylint: disable=assignment-from-none
          ^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bolin/Desktop/hidet/python/hidet/graph/graph_utils/functors.py", line 126, in visit_Operator
    updated_outputs = op.reforward(inputs)
                      ^^^^^^^^^^^^^^^^^^^^
File "/home/bolin/Desktop/hidet/python/hidet/graph/operator.py", line 185, in reforward
    return cls(*inputs, **attributes).outputs
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
torch._dynamo.exc.BackendCompilerFailed: backend='hidet' raised:
TypeError: ClampOp.__init__() missing 2 required positional arguments: 'min_value' and 'max_value'
```

The cause is that inside the [`reforward` function](https://github.com/CentML/hidet/blob/da56e48148c5b075f1fba6d1d878a82889c9f731/python/hidet/graph/operator.py#L180-L185), during the call to `cls(*inputs, **attributes)` where `cls` is `ClampOp`, `inputs` consists only of the input tensor and `attributes` is an empty dictionary, so `min_value` and `max_value` cannot be passed to the initializer. This is because we did not populate the `attributes` dictionary with the values of these two parameters [while initializing `ClampOp`](https://github.com/CentML/hidet/blob/da56e48148c5b075f1fba6d1d878a82889c9f731/python/hidet/graph/ops/arithmetic.py#L586-L595).
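The shape of the fix can be sketched with a simplified stand-in for hidet's operator machinery (the `Operator` base class below is a hypothetical reduction, not hidet's real one): any constructor argument that `reforward` must replay has to be recorded in `attributes`.

```python
from typing import Optional


class Operator:
    """Hypothetical, minimal stand-in for hidet's Operator base class."""

    def __init__(self, inputs, attributes):
        self.inputs = inputs
        self.attributes = attributes

    def reforward(self, inputs):
        # Re-create the operator from new inputs plus the recorded attributes.
        # If a subclass failed to record a required constructor argument in
        # `attributes`, this call raises the TypeError seen in the traceback.
        cls = self.__class__
        return cls(*inputs, **self.attributes)


class ClampOp(Operator):
    def __init__(self, x, min_value: Optional[float], max_value: Optional[float]):
        # The fix: record min_value/max_value in `attributes` so that
        # reforward's cls(*inputs, **attributes) can reconstruct the op.
        super().__init__(
            inputs=[x],
            attributes={'min_value': min_value, 'max_value': max_value},
        )


op = ClampOp('x', 0.0, 6.0)
op2 = op.reforward(['y'])   # succeeds because attributes were recorded
```

Had `ClampOp.__init__` passed `attributes={}` instead, `reforward` would fail with exactly the "missing 2 required positional arguments" TypeError above.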
vadiklyutiy added a commit that referenced this pull request Jul 22, 2024
This PR disallows, in the fxgraph, functions that are unsupported (not registered) in Hidet.

An fxgraph contains functions, methods (methods of `torch.Tensor`), and modules (`torch.nn`). These changes cover functions only.

Notes:
1. Works with torch version >= 2.2.0.
2. A number of functions are allowed and appear in the fxgraph at the dynamo level, but dynamo resolves them before passing the fxgraph to the compiler. If we simply disallow them, we get an additional graph break. As a workaround, these functions are registered, but their implementations just raise an exception.
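The workaround in note 2 can be sketched as follows. The registry and decorator names here are illustrative, not hidet's actual API: the point is that a function is "allowed" by virtue of being registered, so functions that dynamo resolves itself get a stub registration whose body only raises.

```python
# Hypothetical sketch of the note-2 workaround: registration doubles as the
# allow-list, so functions dynamo handles itself are registered with a stub
# that raises if they ever actually reach the backend.
REGISTERED_FUNCTIONS = {}


def register_function(name):
    def decorator(impl):
        REGISTERED_FUNCTIONS[name] = impl
        return impl
    return decorator


def is_allowed(name):
    # dynamo-level check: a function is allowed iff it is registered
    return name in REGISTERED_FUNCTIONS


@register_function('torch.nn.functional.dropout')  # illustrative key
def dropout_stub(*args, **kwargs):
    raise NotImplementedError(
        'expected to be resolved by dynamo before reaching the hidet backend'
    )
```

Registering the stub keeps `is_allowed` true, avoiding the extra graph break, while any unexpected call fails loudly instead of silently miscompiling.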
vadiklyutiy pushed a commit that referenced this pull request Jul 23, 2024
vadiklyutiy added a commit that referenced this pull request Jul 23, 2024
vadiklyutiy added a commit that referenced this pull request Jul 27, 2024
From time to time after introducing the disallowing of unsupported ops in the graph in #317, the following bug arises:
```
>       for module_name, module in sys.modules.items():
E       RuntimeError: dictionary changed size during iteration
```
This fixes one possible cause of it.
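The error occurs when something (for example a lazy import triggered inside the loop body) mutates `sys.modules` while it is being iterated. The standard remedy, and presumably the shape of the fix here, is to iterate over a snapshot instead of the live dictionary view:

```python
import sys

# Iterating sys.modules.items() directly raises RuntimeError if any import
# happens mid-loop; list(...) takes a snapshot first, making the loop safe.
def collect_module_names():
    names = []
    for module_name, module in list(sys.modules.items()):  # snapshot, not live view
        names.append(module_name)
    return names
```

Snapshotting costs one shallow copy of the dict items but removes the dependence on what the loop body (or another thread) imports.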
vadiklyutiy pushed a commit that referenced this pull request Dec 26, 2024
vadiklyutiy added a commit that referenced this pull request Dec 26, 2024
vadiklyutiy added a commit that referenced this pull request Dec 26, 2024