[GPU] Fix rank-changing reorder fusion for depth_to_space #35099
Merged
isanghao merged 1 commit into openvinotoolkit:master on Apr 2, 2026
isanghao reviewed on Apr 1, 2026
6a13e4d to 7976ad4
Description of the issue:
Issues
Symptom:
clBuildProgram kernel compilation failure when running a super-resolution model (pnat-v1-fp16-576x672-2x-ox.onnx) on GPU. The failing node is depthtospace:/gnet/DepthToSpace with the depth_to_space_ref implementation.
Root causes
depth_to_space is rank-preserving by op semantics: input and output must have the same number of dimensions. However, can_fuse_reorder() and can_fuse_reorder_to_prev() in layout_optimizer.cpp unconditionally returned true for depth_to_space, allowing a rank-changing reorder (bfyx → bfzyx) to be fused in.
The fusion chain:
1. The reorder_inputs pass inserts reorder_42 (bfyx → bfzyx) after depth_to_space for the downstream Reshape_2.
2. The remove_redundant_reorders pass calls can_fuse_reorder_to_prev(), which returns true unconditionally, so the rank-changing reorder is fused into depth_to_space.
3. The resulting kernel has an INPUT0_GET_INDEX macro argument count mismatch → clBuildProgram crash.
How to fix it
Add a rank equality guard to two functions in layout_optimizer.cpp: separate depth_to_space from the blanket "return true" group and allow reorder fusion only when the source and target ranks match.
Same-rank reorder fusion (e.g., bfyx → b_fs_yx_fsv16) remains allowed. The rank-changing reorder stays as a separate optimized node with memory sharing (zero-copy), so there is no performance penalty.
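The guard described above can be illustrated with a minimal, standalone sketch. This is not the actual change to layout_optimizer.cpp: FormatInfo, PrimKind, and can_fuse_reorder_sketch are hypothetical names invented for illustration, and the real functions operate on the plugin's own node and format types. The sketch only shows the decision rule: depth_to_space leaves the always-fusable group and fuses a reorder only when the ranks are equal.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a format descriptor (not the plugin's real type).
struct FormatInfo {
    std::string name;  // e.g. "bfyx"
    std::size_t rank;  // number of dimensions the format describes
};

// Hypothetical primitive kinds, reduced to what this sketch needs.
enum class PrimKind {
    DepthToSpace,        // rank-preserving op: needs the rank guard
    OtherAlwaysFusable   // stands in for the blanket "return true" group
};

// Sketch of the rank equality guard (assumed shape, not the merged code).
bool can_fuse_reorder_sketch(PrimKind prev,
                             const FormatInfo& src,
                             const FormatInfo& dst) {
    if (prev == PrimKind::DepthToSpace) {
        // Fuse only same-rank reorders, e.g. bfyx -> b_fs_yx_fsv16 (4D -> 4D).
        // A rank-changing reorder, e.g. bfyx -> bfzyx (4D -> 5D), is rejected
        // and stays a separate node.
        return src.rank == dst.rank;
    }
    // Primitives in the blanket group keep their previous behavior.
    return true;
}
```

With this rule, a call for depth_to_space with bfyx (rank 4) and b_fs_yx_fsv16 (rank 4) allows fusion, while bfyx (rank 4) and bfzyx (rank 5) does not.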
After fix rank-changing reorder fusion
Reproduced steps
Tickets:
AI Assistance: