[UR][CI] Add first version of UR workflow #16827
# Date: Wed Jan 22 11:06:11 2025 +0100
# [common] Bump UMF to early 0.11 version, from main
# It includes i.a. MacOS fix for compiler.
set(UNIFIED_RUNTIME_TAG d193046de592482c47d87fdfaf92c7b8c59c9b66)
Is this tag update necessary for the workflow changes?
Just a dummy change to trigger `sycl` changes (in the detect_changes job). It is now replaced with dummy changes for the `unified-runtime` dir. This commit will be removed before merging, of course.
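For readers unfamiliar with change-based triggering, here is a minimal sketch of running a job only when a given directory changes. It uses GitHub's built-in `paths` filter, not this repository's detect_changes job, and the workflow name, job name, and path glob are illustrative assumptions:

```yaml
# Hypothetical workflow: runs only when files under unified-runtime/ change.
# This is NOT the detect_changes mechanism used in the PR.
name: UR example
on:
  pull_request:
    paths:
      - 'unified-runtime/**'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "unified-runtime changed"
```

With a filter like this, a dummy change inside the watched directory is enough to make the job run on a PR.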
In the meeting we agreed that we should merge this before the PR that pulls in the UR source code, once the jobs are working correctly. Then, in the time between merging this and pulling in the UR sources, we disable the workflow to avoid overloading the runners.
So, FYI, the current status is that the CUDA runner went wild, not sure what happened (I'm betting a bad
There's a new CUDA runner enabled, so we should be good now.
{name: L0, runner: UR_L0},
{name: L0_V2, runner: UR_L0},
{name: L0, runner: UR_L0, static: ON},
{name: OPENCL, runner: UR_OPENCL, platform: "Intel(R) OpenCL"},
Setting the platform name with a string seems error prone: if the spacing or anything else is wrong it will fail, right? Maybe we could use a single word like `opencl` that, if required, gets updated to the required value for the actual build by the ur-build workflow.
Added TODO.
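One way the suggestion above could look, as a rough sketch only: the matrix carries a short key and the exact platform string is written in a single place. The short key `opencl`, the job name, and the `UR_PLATFORM` variable are assumptions for illustration and do not come from this PR:

```yaml
# Hypothetical job: the short key 'opencl' and UR_PLATFORM are assumptions.
jobs:
  example:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        adapter: [{name: OPENCL, platform: opencl}]
    env:
      # Expand the short key to the exact platform string in one place;
      # any other value is passed through unchanged.
      UR_PLATFORM: ${{ matrix.adapter.platform == 'opencl' && 'Intel(R) OpenCL' || matrix.adapter.platform }}
    steps:
      - run: echo "Selected platform is $UR_PLATFORM"
```

Keeping the full string in one expression means a spacing mistake can only happen in one line rather than in every caller.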
Something's gone wrong with the HIP job: https://github.com/intel/llvm/actions/runs/13313402495/job/37182358932?pr=16827
.github/workflows/ur-build-hw.yml
Outdated
strategy:
  matrix:
    adapter: [
      {
        name: "${{inputs.adapter_name}}",
        other_name: "${{inputs.other_adapter_name}}",
        platform: "${{inputs.platform}}",
        static_Loader: "${{inputs.static_loader}}",
        static_adapter: "${{inputs.static_loader}}"
      }
    ]
    build_type: [Release]
    compiler: [{c: gcc, cxx: g++}]
Can we change this so that if one of the matrix jobs fails it doesn't cancel the rest?
This can be follow up work so shouldn't block merging.
Done.
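For reference, the behaviour requested in this thread is controlled by the `fail-fast` key of a job's `strategy`, which defaults to `true`. A minimal sketch, with a hypothetical job name and placeholder step, might look like this:

```yaml
# Hypothetical job; only the fail-fast line reflects the change discussed here.
jobs:
  adapters:
    runs-on: ubuntu-latest
    strategy:
      # Let the remaining matrix jobs finish even if one of them fails.
      fail-fast: false
      matrix:
        build_type: [Release]
        compiler: [{c: gcc, cxx: g++}]
    steps:
      - run: echo "Building with ${{ matrix.compiler.c }} (${{ matrix.build_type }})"
```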
@sarnex @aelovikov-intel is it correct that there's a public holiday in the US on Monday? If so, I would appreciate an approval so that we can merge this once we've addressed the outstanding issues, ahead of the repo move on Tuesday.
Yeah, there's a holiday on Monday. I discussed it with Andrei offline, but please address all feedback in a follow-up.
Yes for sure, thank you!
@kbenzie sorry, to be clear, is it okay if I merge this now? We would prefer that, to catch any issues while we are working. Unaddressed feedback is expected to be addressed in a follow-up PR.
Oh, I see the UR HIP job is failing. That needs to be fixed or disabled before merging.
There's an issue with the HIP runner which @lukaszstolarczuk is looking into.
OK cool, just ping me when it's ready for merge today, or ask a gatekeeper not in the US to merge on Monday.
I already have some changes implemented locally. It makes no difference to me; I can open a new PR with the changes.
If the HIP runner issue isn't resolved by Monday I'll be sure to keep a close eye on things post merge in case something goes awry. Otherwise, if you fix it today @lukaszstolarczuk, please go ahead with the request to merge. I need to log off for now; I'll be back online briefly in a few hours to check on things.
Okay, let's keep it separate to derisk the move.
I think I fixed the HIP issue - could you please re-run the HIP job?
Done, I actually need to go now. Thank you!
Use custom runner names - UR_*
For now I have only pushed the `fail-fast: false` change and the TODO comments. I'm good with following up in the next PR, next week.
@sarnex, if you're good with the few red E2E tests on CI, please merge this.
// The UR_HIP machine has now been run twice, I believe it's fixed properly.
-DUR_DPCXX=${{github.workspace}}/dpcpp_compiler/bin/clang++
-DUR_SYCL_LIBRARY_DIR=${{github.workspace}}/dpcpp_compiler/lib
-DCMAKE_INSTALL_PREFIX=${{github.workspace}}/install
${{ matrix.adapter.name == 'HIP' && '-DUR_CONFORMANCE_AMD_ARCH=gfx1030' || '' }}
added to TODO.
- name: Build
  # This is so that device binaries can find the sycl runtime library
  run: cmake --build ${{github.workspace}}/build -j $(nproc)
Added TODO.
@lukaszstolarczuk So I have a PR here, and it looks like UR CI is running even though it wasn't changed. Is that expected?
in a sense, yes - it was expected - the You can go ahead and disable it, please.
I've disabled the workflow for now, good to see it working. I'll re-enable it before the repo move.
Last commit included just to verify if `test_job` will be executed. With extra dummy changes (now missing in PR) we just didn't skip the `test_job`, e.g. here: https://github.com/intel/llvm/actions/runs/13040807649/job/36382037543