Support more image_channel_types in image_accessors read/write API for half4 datatype on host device. #1502
dc20df7 to 4da30c4
I suggest using std::array with size sizeof(sycl::cl_half4), and avoiding the magic numbers on lines 99 and 80.
I think you are referring to range<1>{10} as the magic number.
The number of bytes needed in HostPtr depends on the image element size (see getImageElementSize() in image_impl.cpp).
For an image of order rgba with channel_type::fp16, each image element takes 8 bytes.
For an image of order rgba with channel_type::snorm_int8, each image element takes 4 bytes.
So when the range is 10, we need a total of 80 bytes (40 bytes in the second case) in HostPtr to safely read and write all elements.
You are right that 10 is an arbitrary number. The value of 100 is more of a safe upper limit. To make it exact, I would have to initialize it for each image kind separately.
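The size arithmetic above can be sketched without the SYCL headers. This is only an illustration under stated assumptions: `ChannelType`, `bytesPerChannel`, `rgbaElementSize`, and `makeHostBuffer` are hypothetical names made up for this example, not the runtime's `getImageElementSize()` or the real `sycl::image_channel_type`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a few SYCL channel types (assumption, not the real enum).
enum class ChannelType { snorm_int8, fp16, fp32 };

// Bytes occupied by a single channel of the given (hypothetical) type.
constexpr std::size_t bytesPerChannel(ChannelType T) {
  switch (T) {
  case ChannelType::snorm_int8:
    return 1;
  case ChannelType::fp16:
    return 2;
  case ChannelType::fp32:
    return 4;
  }
  return 0;
}

// An rgba image element carries four channels.
constexpr std::size_t kRgbaChannels = 4;

// Element size in bytes for an rgba image with the given channel type.
constexpr std::size_t rgbaElementSize(ChannelType T) {
  return kRgbaChannels * bytesPerChannel(T);
}

// Allocates exactly as many bytes as a 1D rgba image of NumElements needs,
// instead of relying on a fixed "safe" upper limit.
inline std::vector<unsigned char> makeHostBuffer(ChannelType T,
                                                 std::size_t NumElements) {
  assert(NumElements > 0 && "image range must be non-empty");
  return std::vector<unsigned char>(NumElements * rgbaElementSize(T));
}

// Compile-time checks matching the numbers quoted in the comment above.
static_assert(rgbaElementSize(ChannelType::fp16) == 8, "rgba fp16 is 8 bytes");
static_assert(rgbaElementSize(ChannelType::snorm_int8) == 4,
              "rgba snorm_int8 is 4 bytes");
```

With a range of 10, `makeHostBuffer(ChannelType::fp16, 10)` would hold 80 bytes and `makeHostBuffer(ChannelType::snorm_int8, 10)` would hold 40, so a per-kind helper like this is one possible way to replace the fixed limit of 100.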
… API for half4 datatype on host device. Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Enabled it for GPU. Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
fp16. Signed-off-by: Garima Gupta <garima.gupta@intel.com>
edd3f55 to 7d48115
@AlexeySachkov @romanovvlad Let me know if you feel anything further is needed in this PR.
@romanovvlad Hi Vlad, please have a quick look whenever you get some time.
@romanovvlad @AlexeySachkov @bader |
…versioning

* origin/sycl:
  [XPTI][Framework] Reference implementation of the Xpti framework to be used with instrumentation in SYCL (intel#1557)
  [SYCL] Initial ABI checks implementation (intel#1528)
  [SYCL] Support connection with multiple plugins (intel#1490)
  [SYCL] Add a new header file with the reduction class definition (intel#1558)
  [SYCL] Add test for SYCL kernels with accessor and spec constant (intel#1536)
  [SYCL][CUDA] Move interop tests (intel#1570)
  [Driver][SYCL] Remove COFF object format designator for Windows device compiles (intel#1574)
  [SYCL] Fix conflicting visibility attributes (intel#1571)
  [SYCL][DOC] Update the SYCL Runtime Interface document with design details (intel#680)
  [SYCL] Improve image accessors support on a host device (intel#1502)
  [SYCL] Make queue's non-USM event ownership temporary (intel#1561)
  [SYCL] Added support of rounding modes for non-host devices (intel#1463)
  [SYCL] SemaSYCL significant refactoring (intel#1517)
  [SYCL] Support 0-dim accessor in handler::copy(accessor, accessor) (intel#1551)
Also cleaned up the image_accessor_readwrite test to enable it for GPU.