forked from intel/llvm
-
Notifications
You must be signed in to change notification settings - Fork 0
Revert task sequence changes #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…#13987) Bump UR L0 commit to oneapi-src/unified-runtime#1694 so that the SYCL device aspect for supporting update in graphs is correctly reported for L0 devices. Currently, support can be incorrectly reported. --------- Co-authored-by: Kenneth Benzie (Benie) <[email protected]>
Running the `Graph/Update` E2E tests on Level Zero with `UR_L0_LEAKS_DEBUG=1` shows that we are leaking a PI kernel and module. On investigation this was because we are retaining these objects in `getOrCreateKernel()` but not releasing them. Added release calls similar to how it is done in [enqueueImpCommandBufferKernel](https://github.com/intel/llvm/blob/b49303c7e13ca0a69454eaaaeb8c3d094916218d/sycl/source/detail/scheduler/commands.cpp#L2550) by the scheduler
This upgrades the docker to use the cuda 12.5 image. I've ran the test-e2e locally using cuda 12.5 and all is well. cuda 12.5 also fixed an issue introduced by the cuda 12.4 driver: see intel#13661 (comment) Signed-off-by: JackAKirk <[email protected]>
…lock_load/slm_block_store APIs accepting simd_view (intel#13978) Co-authored-by: Nick Sarnie <[email protected]>
… build libdevice with thinLTO (intel#14036) This is the first change in my work on thinLTO for SYCL. --------- Signed-off-by: Sarnie, Nick <[email protected]>
Scheduled igc dev drivers uplift Co-authored-by: GitHub Actions <[email protected]>
See intel#14053 Signed-off-by: Sarnie, Nick <[email protected]>
…mbiguity (intel#14018) This change avoids the ambiguity between the deprecated `sycl::ext::oneapi::sub_group` and `sycl::sub_group` when both namespaces are used. This fixes a failure on windows for cuda. --------- Signed-off-by: JackAKirk <[email protected]>
…ate for local_accessors API accepting simd_view (intel#14024)
intel#14000) Co-authored-by: Nick Sarnie <[email protected]>
…el#14061) See intel#14060 --------- Signed-off-by: Sarnie, Nick <[email protected]>
Supported matrix dimensions are queried from the device, and inform the tests which tile sizes one can use. This is a subset of all tests that are planned to be modified. Test manually tested on PVC and SPR --------- Co-authored-by: Yury Plyakhin <[email protected]>
Supported matrix dimensions are queried from the device, and inform the tests which tile sizes one can use. This is a subset of all tests that are planned to be modified. Test manually tested on PVC and SPR - no new regresssions The following tests have been marked as XFAIL on all platforms. I removed them from XMX8 folder. Once they are passing then they can be modified to query the supported matrix dimensions form the device. * joint_matrix_colA_rowB_colC.cpp * joint_matrix_out_bounds.cpp * joint_matrix_unaligned_k.cpp
…ntel#14067) Co-authored-by: Ewan Crawford <[email protected]>
By some reason, we used to only emit unused member functions if they are explicitly annotated with `sycl_device` attribute (through `SYCL_EXTERNAL` macro). This logic was introduced in 3baec18 and there is no clear indication as to why exactly we have a check that the attribute is explicit. SYCL extension for virtual functions introduces an alternative markup for specifying which function and that markup is SYCL compile-time properties that we turn into attributes implicitly under the hood. Essentially, we now have a situation where an implicit `sycl_device` attribute on a member function should be treated as an explicit one, because it could be a result of SYCL compile-time property being applied to that method. Considering our current codebase, it seems like we intend to have member function to be emitted in all cases where `sycl_device` is being implicitly added and therefore this patch removes the requirement for the attribute to be explicit.
This PR adds functionalities for: * Listing devices in stdout * Filtering devices Tests and docs updated accordingly. --------- Signed-off-by: Alberto Cabrera <[email protected]> Co-authored-by: Joe Todd <[email protected]>
…intel#14015) This PR adds a `wait_and_free` func. This makes it safer and less likely to release memory during or before it is used by enqueued commands. `async_free` is renamed `enqueue_free`, to make its behaviour clearer This PR updates the comments and tests accordingly
…d to run on Windows (intel#13957) [Windows doesn't support cudaMemPrefetchAsync()](bitsandbytes-foundation/bitsandbytes#453) which is used in the call to `prefetch` in the test. [urEnqueueUSMPrefetch](https://github.com/oneapi-src/unified-runtime/blob/c0c607c3a88933b4c5c20a0aca4539781c678411/source/adapters/cuda/enqueue.cpp#L1629) is also commented with a note for not having the support for CUDA on Windows.
temp fix for problems from cuda 12.5 uplift that were caused by intel#14049. Should fix intel#14071 --------- Signed-off-by: JackAKirk <[email protected]>
…e exception message (intel#14055) - C++ thrown exception message not shown when running from Windows terminal. - The patch fixes [cuda-max-local-mem-size.cpp](https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Plugin/cuda-max-local-mem-size.cpp) test failure.
… GPUs through OpenCL. (intel#14072) Extend the `sycl-ls-gpu-default.cpp` test to cover the support of Intel GPUs through OpenCL. The patch fixes the failure when running the test on a system with Intel and CUDA gpus.
Fixes ansi-alias violation and reads from uninitialized buffers. Fixes intel#13790.
…UDA (intel#14058) Fails in Nightly testing on the self-hosted CUDA runner: intel#12995.
…date ACC API accepting simd_view (intel#14065) Co-authored-by: Nick Sarnie <[email protected]>
…date USM API accepting simd_view (intel#14043)
…#14087) See intel#14086 Signed-off-by: Sarnie, Nick <[email protected]>
…ntel#14088) CODEOWNERS seems to be missing a line attributing `sycl/test/check_device_code/matrix` tests to intel/sycl-matrix-reviewers (As per [this discussion](intel#14063 (comment))). This PR remedies this. Although, I noticed the current CODEOWNERS section for the matrix reviewers uses paths; Let me know if I should use `sycl/**/matrix` instead.
…address) (intel#13948) UR: oneapi-src/unified-runtime#1677 In kernel, we save at most `ASAN_MAX_NUM_REPORTS` (default: 10) number of SanitizerReport. We select the index of SanitizerReport by `WG_LINEAR_ID % ASAN_MAX_NUM_REPORTS`. When `-fsanitize-recover=address` is passed in compiler flag, `asan_loadX_noabort`/`asan_storeX_noabort` is used, we use `is_recover = true` flag to distinguish this case. If `is_recover` is true, the UR will print out all the error reports and continue (use at your own risk). If `is_recover` is false (default case), the UR will print out only one error report and exit. --------- Co-authored-by: Callum Fare <[email protected]>
This patch fixes errors like "call to 'XXX' is ambiguous" and also disables all TaskSequence tests on Windows because they fail in the same way as on Linux. Github issue was created against that and comment in tests was updated.
In accordance with KhronosGroup/SYCL-Docs#555 proposal, this commit allows raw pointers in the `load` and `store` member functions on `sycl::vec`. --------- Signed-off-by: Larsen, Steffen <[email protected]>
It had been removed from the specification in KhronosGroup/SYCL-Docs#431. Originally introduced in intel#4298 the implementation has never been completed, so, while technically a breaking change, no customer code can be really using it.
) Re-lands intel#13199 and reverts intel#13464.
…SYCL offloading under an option (intel#14262) We have added an experimental option (off by default): -fsycl-use-spirv-backend-for-spirv-gen This will cause device compilation pipeline in SYCL offloading to invoke llc (with SPIR-V target) instead of llvm-spirv to perform SPIR-V code generation. This change will result in the SPIR-V backend being built by default. Also, if spirv-val is available, a sanity check of the generated SPIR-V code will be done and any issues will be reported as warnings to the user. Thanks --------- Signed-off-by: Arvind Sudarsanam <[email protected]>
…vice_level_zero.cpp` (intel#14310) Fixing test that was missed in intel#14162
…l#14175) In the same vein as intel#14174, this PR moves test cases that do not check device code outside of the `check_device_code` folder. This is a separate PR as intel#14174 because the code is technically owned by another team.
When using -fsycl --coverage, we should not enable code coverage for device compilations as code coverage for device is not supported at this time.
error: unused variable 'DecorateAddressIndex' [-Werror,-Wunused-variable]
Due to some confusion about the output from the in_order_profiling_queue test on L0, the test was disabled. However, the test can be safely reenabled for that target, while keeping it disabled for FPGA. Additionally, the failure in profiling_queue is believed to be due to the same issue, so the JIRA has been added to it and the note in in_order_profiling_queue has been updated to reflect the known information about the failure. Signed-off-by: Larsen, Steffen <[email protected]>
UR: oneapi-src/unified-runtime#1676 Instrument "__asan_mem_to_shadow" to convert private address to its shadow memory address Other steps are same with ASan on stack.
…n into IR-based analysis (intel#14220) Based on feedback from intel#14197, I seperated out the code that generates the module properties and symbol table into separate functions that can be called by anyone, and just looks at the IR and entry points. For now, we still call it inside `sycl-post-link` because we still support the old offloading model, but once we drop support for that we can drop this responsibility from sycl-post-link and only compute it inside `clang-linker-wrapper`, both for normal compilation and thinLTO. In a (hopefully soon) future PR I plan to call these functions from `clang-linker-wrapper` when compiling for thinLTO, which we need because we will split early. Most of this change should be NFC(I). The expected changes are: 1) New option to sycl-post-link to generate the properties file 2) Driver change to NOT pass the option from 1) in thinLTO mode 3) Two minor chages in logic from properties generation, I've called these out inline. --------- Signed-off-by: Sarnie, Nick <[email protected]>
…iver (intel#14334) A similar test was disabled in intel@672b225, and this test was found to have the same issue. Signed-off-by: Sarnie, Nick <[email protected]>
This reverts commit 92f5b98.
…add new fpga_cluster kernel property (intel#12453)" This reverts commit 7b9001e.
aejjehint
pushed a commit
that referenced
this pull request
Jun 28, 2024
… (#92855) This solves some ambuguity introduced in P0522 regarding how template template parameters are partially ordered, and should reduce the negative impact of enabling `-frelaxed-template-template-args` by default. When performing template argument deduction, we extend the provisional wording introduced in llvm/llvm-project#89807 so it also covers deduction of class templates. Given the following example: ```C++ template <class T1, class T2 = float> struct A; template <class T3> struct B; template <template <class T4> class TT1, class T5> struct B<TT1<T5>>; // #1 template <class T6, class T7> struct B<A<T6, T7>>; // #2 template struct B<A<int>>; ``` Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This patch restores the pre-P0522 behavior, `#2` is picked again. This has the beneficial side effect of making the following code valid: ```C++ template<class T, class U> struct A {}; A<int, float> v; template<template<class> class TT> void f(TT<int>); // OK: TT picks 'float' as the default argument for the second parameter. void g() { f(v); } ``` --- Since this changes provisional implementation of CWG2398 which has not been released yet, and already contains a changelog entry, we don't provide a changelog entry here.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This reverts PR intel#12453 and intel#13080