[IE CLDNN] Add new operation GatherND-5 to IE clDNN plugin#4857
Conversation
b57cde3 to
2958651
Compare
03f0913 to
ccc6ede
Compare
|
CI issues:
|
ccc6ede to
b936f06
Compare
inference-engine/thirdparty/clDNN/tests/test_cases/gather_nd_gpu_test.cpp
Outdated
Show resolved
Hide resolved
inference-engine/thirdparty/clDNN/tests/test_cases/fusings_gpu_test.cpp
Outdated
Show resolved
Hide resolved
inference-engine/thirdparty/clDNN/tests/test_cases/fusings_gpu_test.cpp
Outdated
Show resolved
Hide resolved
inference-engine/thirdparty/clDNN/kernel_selector/core/cl_kernels/gather_nd_ref.cl
Outdated
Show resolved
Hide resolved
inference-engine/thirdparty/clDNN/kernel_selector/core/cl_kernels/gather_nd_ref.cl
Outdated
Show resolved
Hide resolved
inference-engine/thirdparty/clDNN/kernel_selector/core/cl_kernels/gather_nd_ref.cl
Outdated
Show resolved
Hide resolved
inference-engine/thirdparty/clDNN/kernel_selector/core/cl_kernels/gather_nd_ref.cl
Outdated
Show resolved
Hide resolved
...-engine/thirdparty/clDNN/kernel_selector/core/actual_kernels/gather/gather_nd_kernel_ref.cpp
Outdated
Show resolved
Hide resolved
...-engine/thirdparty/clDNN/kernel_selector/core/actual_kernels/gather/gather_nd_kernel_ref.cpp
Show resolved
Hide resolved
inference-engine/thirdparty/clDNN/kernel_selector/core/cl_kernels/gather_nd_ref.cl
Outdated
Show resolved
Hide resolved
inference-engine/thirdparty/clDNN/kernel_selector/core/cl_kernels/gather_nd_ref.cl
Show resolved
Hide resolved
b936f06 to
b5bf219
Compare
|
@vladimir-paramuzov Please review this. |
b5bf219 to
cdcf20f
Compare
|
@sshlyapn Please review this |
|
@vladimir-paramuzov @sshlyapn Please review and merge this if no issue. |
lznamens
left a comment
There was a problem hiding this comment.
In general LGTM. Please make some cosmetic fixes.
| { {15, 12, 20, 15, 2}, {15, 12, 18, 7, 17} })), // Data shape | ||
| ::testing::ValuesIn(std::vector<std::vector<size_t>>( | ||
| { {15, 12, 2}, {15, 12, 5, 9, 1, 3} })), // Indices shape | ||
| ::testing::ValuesIn(std::vector<int>({ 1, 2 })) // Batch dims |
There was a problem hiding this comment.
Please align this comment as others.
| const uint dim1 = get_global_id(1); | ||
| const uint dim2 = get_global_id(2); | ||
|
|
||
| //Calculate indice index |
There was a problem hiding this comment.
Please add space after "//".
|
|
||
| const uint idx_arr[INPUT1_DIMS*2] = {idx_b, idx_f, idx_z, idx_y, idx_x, 0, 0, 0, 0, 0}; | ||
| const uint idx_dim[INPUT1_DIMS] = {INPUT1_BATCH_NUM, INPUT1_FEATURE_NUM, INPUT1_SIZE_Z, INPUT1_SIZE_Y, INPUT1_SIZE_X}; | ||
| #elif INPUT1_DIMS == 6 |
There was a problem hiding this comment.
Why do you use "elif" construction here? If there may be another INPUT1_DIMS cases then unexpected result will be here. My point is that we should use "else" construction in cases when all possible variants are under "if" or use "#error" message after #elif INPUT1_DIMS == 6 when there may be something else.
There was a problem hiding this comment.
Oh, as I can see, we have all input and output dimenshion checks in code lines from 20 to 42. What do you think, may be, it'll be better to add 3 preprocessing "#error" directives here (into lines 20 - 42). And after that in the kernel body to use such construction as:
#if DIMS_NUM == 4
...
#elif DIMS_NUM == 5
...
#else
...
#endif
In this case all unsupported dimension checks will be in a one place before the kernel body. I'm sorry that I didn't see it in my first review attempt...
There was a problem hiding this comment.
How about DIMS_NUM == 2 or 3? I think DIMS_NUM < 4 case should go else(#error). So the #elif DIMS_NUM ==6 and #else are need.
There was a problem hiding this comment.
You can process all these variants here:
#if INPUT0_DIMS == 4
#define IN_ORDER in_b,in_f,in_y,in_x
#elif INPUT0_DIMS == 5
#define IN_ORDER in_b,in_f,in_z,in_y,in_x
#elif INPUT0_DIMS == 6
#define IN_ORDER in_b,in_f,in_w,in_z,in_y,in_x
#else
#error gather_nd_ref.cl - not supported INPUT0_DIMS_NUM
#endif
And the same for INPUT1_DIMS and OUTPUT_DIMS. As for me, this will simplify your code in the kernel body. You'll may use only:
#if INPUT1_DIMS == 4
...
#elif INPUT1_DIMS == 5
...
#else
...
#endif
And all checks for DIMS NUM will be placed before the kernel body.
There was a problem hiding this comment.
Okay, I made the code simple as you your suggestion.
There was a problem hiding this comment.
I thought that you'll make error processing before the kernel body but I suppose your variant also'll work corectly because in GetSupportedKeys method we have only simple bfyx, bfzyx, bfwzyx formats for input and output layouts. Thanks.
| const uint in_x = indices_val[4]; | ||
| const uint in_y = indices_val[3]; | ||
| const uint in_z = indices_val[2]; | ||
| #elif INPUT0_DIMS == 6 |
There was a problem hiding this comment.
The same as for line 82 is here.
| const uint out_x = idx_arr[BATCH_DIMS+3]; | ||
| const uint out_y = idx_arr[BATCH_DIMS+2]; | ||
| const uint out_z = idx_arr[BATCH_DIMS+1]; | ||
| #elif OUTPUT_DIMS == 6 |
There was a problem hiding this comment.
The same as for line 82 is here.
| const uint y_pitch = OUTPUT_SIZE_X; | ||
| const uint z_pitch = y_pitch * OUTPUT_SIZE_Y; | ||
| const uint f_pitch = z_pitch * OUTPUT_SIZE_Z; | ||
| #elif OUTPUT_DIMS == 6 |
There was a problem hiding this comment.
The same as for line 82 is here.
|
|
||
| const uint z = f_remain / z_pitch; | ||
| const uint y = z_remain / y_pitch; | ||
| #elif OUTPUT_DIMS == 6 |
There was a problem hiding this comment.
The same as for line 82 is here.
| const uint f = b_remain / f_pitch; | ||
| const uint x = y_remain; | ||
|
|
||
| FUSED_OPS; |
There was a problem hiding this comment.
Maybe it makes sence to try use FUSED_OPS_PRELOAD and FUSED_OPS_CALC macro for getting better perf values?
| layout gather_nd_inst::calc_output_layout(gather_nd_node const& node) { | ||
| auto op = node.get_primitive(); | ||
|
|
||
| auto input_layout_orgin = node.input(0).get_output_layout(); |
There was a problem hiding this comment.
I amn't sure but maybe "input_layout_origin"?
There was a problem hiding this comment.
Ooops. It was a typo. I've fixed.
Signed-off-by: Kelvin Choi <kelvin.choi@intel.com>
49c389e to
c53e66b
Compare
|
@lznamens @vladimir-paramuzov CI passed. Please review this. |
Signed-off-by: Kelvin Choi <kelvin.choi@intel.com>
c53e66b to
3fe08cc
Compare
|
@lznamens @vladimir-paramuzov Please merge this unless there is more comment. |
Comments have been addressed
Signed-off-by: Kelvin Choi kelvin.choi@intel.com
ID: 48366
Details: