
OpenCL: core support for more formats, including float16#20288

Closed
JoeHowse wants to merge 6 commits into opencv:master from JoeHowse:cl_image-float16-conversions

Conversation

@JoeHowse
Contributor

@JoeHowse JoeHowse commented Jun 21, 2021

Changes

This draft pull request attempts to improve support for float16, as well as for some additional channel formats, in the OCL back-end. Specifically, the changes include the following:

  • Support conversion for CL_HALF_FLOAT (float16)

  • Support conversion for additional channel orders: CL_A, CL_INTENSITY, CL_LUMINANCE, CL_RG, CL_RA

  • Comment on why conversion is unsupported for CL_RGB. (See CV_8UC3 Mat not convertible to ocl::Image2D #8108 and the latest OpenCL documentation on supported types for CL_RGB.)

  • Predict optimal vector width for float16

  • Support string conversion for float16 kernels

  • Support querying OpenCL's float16 support (via Device::halfFPConfig) for any OpenCL version. Previously, the code enforced OpenCL version >= 1.2, but this limitation was artificial. Even OpenCL 1.0 supports the underlying config property, CL_DEVICE_HALF_FP_CONFIG, as shown in the OpenCL 1.0 documentation.

  • Support for float16 in ocl_gemm implementation

  • Performance test cases for ocl_gemm with float16 and float64

  • Support default range [-1.0, 1.0] for float16 in randu. (randu already supported float16 when a range was explicitly specified. With this change, randu also supports float16 with a default range, just as it already does for other types.)
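For background on what CL_HALF_FLOAT conversion involves, the sketch below encodes and decodes IEEE 754 binary16 (float16) values in plain C++. It is an illustrative toy, not code from this patch: rounding is by truncation, denormal results are flushed to zero, and the function names are made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Encode a 32-bit float as IEEE 754 binary16. Illustrative only:
// truncating rounding, denormal results flushed to zero.
static uint16_t float_to_half(float f) {
    uint32_t x;
    std::memcpy(&x, &f, sizeof(x));
    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t exp32 = (x >> 23) & 0xFFu;
    uint32_t mant = x & 0x7FFFFFu;
    if (exp32 == 0xFFu)                        // inf or NaN
        return sign | 0x7C00u | (mant ? 0x200u : 0u);
    int32_t exp16 = (int32_t)exp32 - 127 + 15; // rebias the exponent
    if (exp16 >= 31) return sign | 0x7C00u;    // overflow -> +/-inf
    if (exp16 <= 0)  return sign;              // underflow -> +/-0
    return sign | (uint16_t)(exp16 << 10) | (uint16_t)(mant >> 13);
}

// Decode binary16 back to a 32-bit float (denormals decoded as zero).
static float half_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp16 = (h >> 10) & 0x1Fu;
    uint32_t mant = (uint32_t)(h & 0x3FFu);
    uint32_t x;
    if (exp16 == 0)       x = sign;                              // +/-0
    else if (exp16 == 31) x = sign | 0x7F800000u | (mant << 13); // inf/NaN
    else                  x = sign | ((exp16 - 15 + 127) << 23) | (mant << 13);
    float f;
    std::memcpy(&f, &x, sizeof(f));
    return f;
}
```

Note that values such as -1.0, 0.5, and 1.0 are exactly representable in binary16, which is one reason a default range of [-1.0, 1.0] is a safe choice for randu.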


force_builders=Custom Mac
test_opencl:Custom Mac=ON

JoeHowse added 3 commits June 21, 2021 00:46
* Support conversion for CL_HALF_FLOAT (float16)

* Support conversion for additional channel orders: CL_A, CL_INTENSITY,
  CL_LUMINANCE, CL_RG, CL_RA

* Comment on why conversion is unsupported for CL_RGB

* Predict optimal vector width for float16
* Support float16 in ocl::kernelToStr
* Drop the artificial requirement for OpenCL version >= 1.2 in
  ocl::Device::halfFPConfig. Even OpenCL 1.0 supports the underlying
  config property, CL_DEVICE_HALF_FP_CONFIG.

* Update opencl_info.hpp to provide info on OpenCL half-float support,
  like pre-existing info on double-float support.
@JoeHowse JoeHowse changed the title from "cl_image: support more formats, including float16" to "OpenCL: core support for more formats, including float16" Jun 22, 2021
* Report preferred half-float vector width when dumping OpenCL info
@JoeHowse JoeHowse marked this pull request as draft June 27, 2021 00:11
JoeHowse added 2 commits June 26, 2021 21:20
* Support for float16 in ocl_gemm implementation

* Performance test cases for ocl_gemm with float16 and float64

* Supporting default range [-1.0, 1.0] for float16 in randu
* Accept float16 input in TestBase::warmup
@JoeHowse
Contributor Author

Support for float16 computations is proving to be more problematic than I originally expected. After trying many hardware/driver combinations, I have concluded that OpenCL drivers rarely implement support for float16 computations, even if the hardware is capable of float16 vectorization.

I am leaving this pull request open as a draft for discussion. Meanwhile, I plan to factor out some of the other changes (the ones that do not depend on OpenCL float16 computations) as a separate pull request.

See details below.

Tested hardware/driver combinations

To check whether OpenCL float16 computations are supported, I am calling cv::ocl::Device::halfFPConfig, which simply queries OpenCL's CL_DEVICE_HALF_FP_CONFIG property.

On Windows, I tried the following combinations and found that none of them support float16 computations in OpenCL:

  • NVIDIA GeForce RTX 3080 with the latest Game Ready Driver
  • AMD Radeon Pro WX 9100 with the latest Enterprise Driver
  • AMD Radeon RX 580 with the latest Adrenalin Driver
  • AMD Radeon RX 580 with the latest Enterprise Driver
  • Udoo Bolt V8 board with Vega 8 integrated GPU and the Embedded V-Series APU Catalyst Driver

Likewise, on Mac, I tried the following combinations and found that none of them support float16 computations in OpenCL:

  • Intel Iris Pro integrated GPU in a MacBook Pro mid-2015
  • AMD Radeon Pro WX 9100 as an external GPU
  • AMD Radeon RX 560 as an external GPU

On Linux, results were mixed:

  • NVIDIA GeForce RTX 3080 with the closed-source NVIDIA drivers: Reports that it does not support float16 computations.
  • Udoo Bolt V8 board with Vega 8 integrated GPU and the open-source Mesa (OpenCL 1.1) drivers: Reports that it does support float16 computations. However, in general, when this system runs OpenCV's OpenCL tests, it freezes with a black screen.
  • ODROID-XU4 board with ARM Mali-T628MP6 integrated GPU and the open-source Panfrost drivers: Reports that it does support float16 computations. However, in general, when this system runs OpenCV's OpenCL tests, it reports memory allocation errors. With the changes in this draft pull request, it can get as far as trying to compile a float16 kernel, but then the compiler reports that the half type is unsupported.

To-do: On Linux, I should try other types of AMD hardware and AMD drivers (AMDGPU, AMD ROCm).

Plans to refactor

The following subset of the changes does not depend on OpenCL float16 computations and could be refactored into a separate pull request:

  • Support clImage conversion for additional channel orders: CL_A, CL_INTENSITY, CL_LUMINANCE, CL_RG, CL_RA

  • Comment on why clImage conversion is unsupported for CL_RGB.

  • Support querying OpenCL's float16 support (via Device::halfFPConfig) for any OpenCL version. Previously, the code enforced OpenCL version >= 1.2, but this limitation was artificial.

  • Add performance test cases for ocl_gemm with float64. These cases are skipped if the system does not support OpenCL float64 computations.

  • Support default range [-1.0, 1.0] for float16 in randu. (randu already supported float16 when a range was specified in arguments. With this change, randu also supports float16 with a default range, just as it already does for other types.)

Member

@alalek alalek left a comment


Improving the gemm OpenCL code path alone, without touching the generic C++/CPU code, doesn't make sense.

Adding performance tests without accuracy tests doesn't make sense either.

My suggestion is to drop the gemm() changes from this patch as incomplete.

randu(src3, -10.0, 10.0);

OCL_TEST_CYCLE() cv::gemm(src1, src2, 0.6, src3, 1.5, dst, flags);

Member


Below:

if (CV_MAT_DEPTH(type) == CV_16F)
    SANITY_CHECK_NOTHING();
else
    SANITY_CHECK(dst, 0.01);

Contributor Author


Sorry, could you please explain this comment/suggestion further?

Member


This is used to bypass a test failure message caused by missing sanity data in the opencv_extra repository.
It makes sense to bypass the accuracy check for 16F results in performance tests (the result may be inaccurate on different platforms / OpenCL devices).

CV_Assert_N( type == matB.type(), (type == CV_32FC1 || type == CV_64FC1 || type == CV_32FC2 || type == CV_64FC2) );
CV_Assert_N( type == matB.type(),
(type == CV_32FC1 || type == CV_64FC1 || type == CV_16FC1 ||
type == CV_32FC2 || type == CV_64FC2 || type == CV_16FC2) );
Member


Does it really work?

  • matA of 32F
  • matB of 16F

Contributor Author


The assertion's first condition, type == matB.type(), ensures that the two matrices have the same type.
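As a toy illustration (this is not OpenCV code; the type tags and function name are made up for this example), the combined check can be reduced to the following: because the first clause pins matB's type to matA's, the whitelist only needs to inspect one of them.

```cpp
// Hypothetical stand-ins for OpenCV type codes such as CV_32FC1.
enum MatType { T16F, T32F, T64F, T8U };

// Mirrors the shape of the CV_Assert_N condition above: the first clause
// requires matA and matB to share a type; the second whitelists that type.
static bool gemm_types_ok(MatType matA, MatType matB) {
    return matA == matB && (matA == T32F || matA == T64F || matA == T16F);
}
```

With matA of 32F and matB of 16F, the first clause already fails, so the mixed pair is rejected before the whitelist is even consulted.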

@JoeHowse
Contributor Author

Improving the gemm OpenCL code path alone, without touching the generic C++/CPU code, doesn't make sense.

Adding performance tests without accuracy tests doesn't make sense either.

My suggestion is to drop the gemm() changes from this patch as incomplete.

I would be happy to provide a refactored patch, excluding the GEMM changes and anything else that is currently incomplete or problematic.

@alalek What are your thoughts on the changes in ocl.cpp?

@alalek
Member

alalek commented Jun 30, 2021

The changes besides gemm() (ocl.cpp, ts update) look good to me.

@JoeHowse
Contributor Author

Thank you for the review. I have opened a refactored pull request, #20336.

@JoeHowse
Contributor Author

JoeHowse commented Aug 3, 2021

I am closing this pull request because a refactored version of it (#20336) has been merged.

@JoeHowse JoeHowse closed this Aug 3, 2021