Rewrite OpenCL device discovery to SYCL. #1401

clevels · 2020-03-26T20:24:28Z

Signed-off-by: Cory Levels [email protected]

Signed-off-by: Cory Levels <[email protected]>

sycl/tools/get_device_count_by_type.cpp

bader · 2020-03-27T09:37:43Z

sycl/tools/get_device_count_by_type.cpp

@@ -8,6 +8,7 @@

 #include <CL/cl.h>
 #include <CL/cl_ext.h>
+#include <CL/sycl.hpp>

 #ifdef USE_PI_CUDA


Do we need CUDA specific code now?

bader · 2020-03-27T09:38:11Z

sycl/tools/CMakeLists.txt

@@ -4,7 +4,9 @@ set(CMAKE_CXX_EXTENSIONS OFF)

 add_executable(get_device_count_by_type get_device_count_by_type.cpp)
 add_dependencies(get_device_count_by_type ocl-headers ocl-icd)


Suggested change

add_dependencies(get_device_count_by_type ocl-headers ocl-icd)

I cannot remember why this was added but I ran into situations where the use of the OpenCL::Headers target was not enough. I suggest to test this when removing it, e.g., by starting from a completely fresh repo with and without the OCL headers and Khronos ICD.

romanovvlad · 2020-03-27T13:19:41Z

sycl/tools/get_device_count_by_type.cpp

-        if (iRet == CL_SUCCESS) {
-            deviceCount += deviceCountPart;
-        }
+    std::vector<platform> platforms(platform::get_platforms());


Don't think it's right approach to use SYCL here.
Instead I suggest removing get_device_count_by_type completely and updating tests so they try to run on all device types so if such a device is not available just print "Skip".

Agree on not using SYCL RT. For example, if there is a bug in platforms or device implementation, or something triggered by the plugin discovered, all test will fail and could be very difficult to figure out what the problem is.

There is no reason to run all SYCL LIT testing if SYCL RT platforms/device discovery is broken, i.e. can't detect correctly what's there. I think that for the sake of SYCL LIT testing we should run it on all platforms/devices detected by SYCL RT. There should be a separate single test that verifies that SYCL RT is doing the correct discovery as compared to what's discovered by low-level run-times.

Agree on not using SYCL RT. For example, if there is a bug in platforms or device implementation, or something triggered by the plugin discovered, all test will fail and could be very difficult to figure out what the problem is.

To check that the plugin functionality or platform/device implementations are working fine, we can add a test case which checks the listed devices by calling all low level APIs and then checking it w.r.t. the devices returned from plugin. This doesn't need to be checked by all tests.

We would ideally like to start recognizing all devices from all plugins. As that code is being changed, this logic will also need to change in parallel. It is better to use the SYCL APIs, so that it is maintained at one location.

Don't think it's right approach to use SYCL here.
Instead I suggest removing get_device_count_by_type completely and updating tests so they try to run on all device types so if such a device is not available just print "Skip".

I agree, the tests could be written such that they execute on all attached devices and report any error found with any attached device.
Any test needing a special device (opencl only)/ avoiding any specific device testing (like skipping cuda), should use custom device_selectors. We can create common utilities for this purpose. This approach will mean lesser maintenance and customization.

To check that the plugin functionality or platform/device implementations are working fine, we can add a test case which checks the listed devices by calling all low level APIs and then checking it w.r.t. the devices returned from plugin. This doesn't need to be checked by all tests.

Seems to me that, somehow, that test should run before any other, so that if that code is broken, no other test is run and we avoid a massive list of failure which can waste a lot of time.

I am not sure if LIT has a mode to stop all further testing if a specific test breaks.

clevels · 2020-03-27T17:09:38Z

Hey Vlad, Yes I know there is some discussion about this change with @garimagu and @smaslov-intel . It was unclear to me what the next steps should be so I thought the conversation should continue here.

keryell

Besides this PR this code looks strange at the first place.
Like someone tried for each line to do a demo of a different way/style to express a thing in C/C++...
For example line 35 & 36.

Ruyk · 2020-04-01T15:46:13Z

sycl/tools/get_device_count_by_type.cpp

+    std::vector<platform> platforms(platform::get_platforms());
+    for (cl_uint i = 0; i < platforms.size(); i++) {
+      std::vector<device> result = platforms[i].get_devices(device_type);
+      deviceCount += result.size();


If the SYCL interface is used, then the CUDA backend check could be done in this loop as well, are you planning to do that?

bjoernknafla · 2020-04-02T08:02:44Z

At the moment our LIT tests are annoted with REQUIRES: opencl or UNSUPPORTED: cuda. With the rewrite this tool won't deliver the information if we are have OpenCL interop or a CUDA backend anymore.

Dealing with this could/should be part of another PR though I wonder what the approach should be?

We could completely drive this from LIT parameters or env vars (or a combination of both), though then we still would need a way to tell this tool to only focus on platforms/backends that have the expected features.

As an example, the following PR does a less high-level fix for the tool but adds extra backend parameters for it to focus either on OpenCL or on CUDA, and the LIT config has been adapted accordingly: #1300

bjoernknafla · 2020-04-17T11:26:17Z

#1300 got merged which cleans up the functionality expected from get-device-count-by-type to interface with LIT to enable OpenCL or CUDA specific testing.

Co-Authored-By: Alexey Bader <[email protected]>

vladimirlaz · 2020-04-30T09:27:05Z

@clevels, could you rebase the PR if it is still needed? I guess alike changes were implemented in #1458.

romanovvlad · 2020-05-08T13:19:16Z

Still think that it does not make sense to rewrite get_device_count_by_type to use SYCL API.
I think we need to:

remove the tool completely
add new cmake targets per each backend and device type(e.g. check-sycl-cuda-gpu, check-sycl-opencl-fpga, ..)
modify all tests so they are skipped if device is not found
have a separate target which runs tests which make sure SYCL sees devices available on the system using low level APIs(OpenCL, CUDA).

bjoernknafla · 2020-05-08T15:36:39Z

I like the idea of makeing tests more "atomic" in the device they are testing. It feels cleaner - though it might add time overhead to run due to going through more LIT/Python (just a guess).

Though even with that route, won't we need a way to query the platform we run on to understand which tests can be run, e.g., which backends are available, and what kind of devices per backend, and perhaps even what kind of extensions per device?

vladimirlaz · 2020-05-11T05:45:34Z

Here is an example how it can be made: https://github.com/vladimirlaz/llvm-test-suite/tree/sycl_poc/SYCL/Wimpy. Target device is passed to LIT framework as a parameter.

smaslov-intel · 2020-05-11T18:28:18Z

tag @rbegam

romanovvlad · 2020-05-12T13:36:44Z

won't we need a way to query the platform we run on to understand which tests can be run

In the proposal tests just early exit with message "Skipped" if they cannot find a device requested.
We will have a lot of overhead on compiling tests that will be skipped though. It can be solved in CI by running only "OpenCL tests"(target) in "OpenCL" jobs and "CUDA tests"(targets) in "CUDA" jobs.

romanovvlad · 2020-06-05T13:26:03Z

Closing due to lack of activity. Feel free to reopen if needed.

clevels force-pushed the rewrite_opencl branch 2 times, most recently from 35b9347 to dbdad23 Compare March 26, 2020 20:45

Rewrite OpenCL device discovery to SYCL.

6e06110

Signed-off-by: Cory Levels <[email protected]>

clevels force-pushed the rewrite_opencl branch from dbdad23 to 6e06110 Compare March 26, 2020 20:56

bader requested changes Mar 27, 2020

View reviewed changes

romanovvlad requested changes Mar 27, 2020

View reviewed changes

keryell reviewed Mar 28, 2020

View reviewed changes

romanovvlad requested review from smaslov-intel and garimagu March 30, 2020 13:37

Ruyk reviewed Apr 1, 2020

View reviewed changes

Update sycl/tools/get_device_count_by_type.cpp

4882fa6

Co-Authored-By: Alexey Bader <[email protected]>

clevels requested a review from a team as a code owner April 28, 2020 15:45

clevels requested a review from sergey-semenov April 28, 2020 15:45

romanovvlad closed this Jun 5, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite OpenCL device discovery to SYCL. #1401

Rewrite OpenCL device discovery to SYCL. #1401

clevels commented Mar 26, 2020

bader Mar 27, 2020

bader Mar 27, 2020

bjoernknafla Apr 2, 2020

romanovvlad Mar 27, 2020

Ruyk Mar 30, 2020

smaslov-intel Mar 30, 2020

garimagu Mar 30, 2020

garimagu Mar 30, 2020

Ruyk Apr 1, 2020

bjoernknafla Apr 2, 2020

clevels commented Mar 27, 2020

keryell left a comment

Ruyk Apr 1, 2020

bjoernknafla commented Apr 2, 2020 •

edited

Loading

bjoernknafla commented Apr 17, 2020

vladimirlaz commented Apr 30, 2020

romanovvlad commented May 8, 2020

bjoernknafla commented May 8, 2020 •

edited

Loading

vladimirlaz commented May 11, 2020

smaslov-intel commented May 11, 2020

romanovvlad commented May 12, 2020

romanovvlad commented Jun 5, 2020

		@@ -4,7 +4,9 @@ set(CMAKE_CXX_EXTENSIONS OFF)

		add_executable(get_device_count_by_type get_device_count_by_type.cpp)
		add_dependencies(get_device_count_by_type ocl-headers ocl-icd)

Rewrite OpenCL device discovery to SYCL. #1401

Rewrite OpenCL device discovery to SYCL. #1401

Conversation

clevels commented Mar 26, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

clevels commented Mar 27, 2020

keryell left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bjoernknafla commented Apr 2, 2020 • edited Loading

bjoernknafla commented Apr 17, 2020

vladimirlaz commented Apr 30, 2020

romanovvlad commented May 8, 2020

bjoernknafla commented May 8, 2020 • edited Loading

vladimirlaz commented May 11, 2020

smaslov-intel commented May 11, 2020

romanovvlad commented May 12, 2020

romanovvlad commented Jun 5, 2020

bjoernknafla commented Apr 2, 2020 •

edited

Loading

bjoernknafla commented May 8, 2020 •

edited

Loading