[NFC] Switch dynamic inputs to flow.tensor.dynamic_constant. #21461

Merged
hanhanW merged 1 commit into iree-org:main from
hanhanW:update-quantized-matmul-matmul-test
Jul 23, 2025

Conversation

@hanhanW
Contributor

@hanhanW hanhanW commented Jul 23, 2025

The compiler's static shape inference is smart enough to produce partially dynamic shapes during lowering. This makes data-tiling fusion struggle: the shapes are expected to stay fully dynamic, but some dimensions are inferred to static values in the Stream `AnnotateDispatchAssumptions` pass. The result is a `tensor.cast -> set_encoding -> tensor.cast` sequence inside a dispatch, while we expect the bindings to have encoded tensor types. E.g.,

Input IR:

```mlir
%0 = iree_tensor_ext.dispatch_load ... tensor<?x?xi8>
%1 = set_encoding %0 : tensor<?x?xi8> -> tensor<?x?xi8, #encoding>
iree_tensor_ext.dispatch_store %1, ... tensor<?x?xi8, #encoding> ->
  ... tensor<?x?xi8, #encoding>
```

After annotation:

```mlir
%0 = iree_tensor_ext.dispatch_load ... tensor<4x?xi8>
%cast = tensor.cast %0 : tensor<4x?xi8> -> tensor<?x?xi8>
%1 = set_encoding %cast : tensor<?x?xi8> -> tensor<?x?xi8, #encoding>
%cast_0 = tensor.cast %1 : tensor<?x?xi8, #encoding> to tensor<4x5xi8>
iree_tensor_ext.dispatch_store %cast_0, ... tensor<4x5xi8> ->
  ... tensor<?x?xi8, #encoding>
```

It is hard to materialize the encodings when a cast op is present.

Given that the original goal is to test dynamic shapes, modifying the input program is the easier fix.

The issue is observed from #21441.
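The fix is to build the test inputs with `flow.tensor.dynamic_constant`, which materializes a constant but erases the chosen dimensions to dynamic ones at the producer, so downstream shape inference cannot fold them back to static values. A minimal sketch of the idea (the constant values and shapes here are illustrative, not taken from the actual test):

```mlir
// Before: a plain constant lets shape inference recover 2x4 everywhere,
// which reintroduces static dims (and tensor.cast ops) in the dispatch.
%static = arith.constant dense<1> : tensor<2x4xi8>

// After: the dims are erased to dynamic at the source, so passes such as
// Stream's AnnotateDispatchAssumptions keep the bindings fully dynamic.
%dynamic = flow.tensor.dynamic_constant dense<1> : tensor<2x4xi8> -> tensor<?x?xi8>
```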

Signed-off-by: hanhanW <hanhan0912@gmail.com>
@hanhanW hanhanW enabled auto-merge (squash) July 23, 2025 00:27
@hanhanW hanhanW merged commit 5d3c479 into iree-org:main Jul 23, 2025
43 checks passed
@hanhanW hanhanW deleted the update-quantized-matmul-matmul-test branch July 23, 2025 00:42
AWoloszyn pushed a commit that referenced this pull request Dec 1, 2025