[CI] Add CI workflow to run compute-benchmarks on incoming syclos PRs #14454
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
name: Aggregate compute-benchmark averages from historical data | ||
|
||
# The benchmarking workflow in sycl-linux-run-tests.yml passes or fails based on | ||
# how the benchmark results compare to a historical average: This historical | ||
# average is calculated in this workflow, which aggregates historical data and | ||
# produces measures of central tendency (median in this case) used for this | ||
# purpose. | ||
|
||
on: | ||
workflow_dispatch: | ||
inputs: | ||
lookback_days: | ||
description: | | ||
Number of days to look back from today in historical results. | ||
This sets the age limit of data used in the average calculation: any | ||
benchmark result created more than `lookback_days` days before today is | ||
excluded from the historical average. | ||
type: number | ||
required: true | ||
workflow_call: | ||
inputs: | ||
lookback_days: | ||
type: number | ||
required: true | ||
secrets: | ||
LLVM_SYCL_BENCHMARK_TOKEN: | ||
description: | | ||
Github token used by the faceless account to push newly calculated | ||
medians. | ||
required: true | ||
|
||
|
||
permissions: | ||
contents: read | ||
|
||
jobs: | ||
aggregate: | ||
name: Aggregate average (median) value for all metrics | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v4 | ||
with: | ||
sparse-checkout: | | ||
devops/scripts/benchmarking | ||
devops/benchmarking | ||
devops/actions/benchmarking | ||
- name: Aggregate benchmark results and produce historical average | ||
uses: ./devops/actions/benchmarking/aggregate | ||
with: | ||
lookback_days: ${{ inputs.lookback_days }} | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.LLVM_SYCL_BENCHMARK_TOKEN }} |
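For readers unfamiliar with the aggregation this workflow triggers, here is a minimal sketch of a lookback-limited median. It is not the PR's aggregate.py, which is not shown on this page. The function name `aggregate_median` and the assumption that each sample carries a `%Y%m%d_%H%M%S` timestamp string are hypothetical and used for illustration only.

```python
from datetime import datetime, timedelta
from statistics import median

# Assumption: samples are labelled with the same timestamp layout that the
# aggregate action produces via `date +%Y%m%d_%H%M%S`.
TIMESTAMP_FORMAT = "%Y%m%d_%H%M%S"


def aggregate_median(samples, lookback_days, now):
    """Return the median of sample values no older than the lookback cutoff.

    samples: iterable of (timestamp_string, value) pairs (hypothetical layout).
    Returns None when no sample falls inside the lookback window.
    """
    cutoff = now - timedelta(days=lookback_days)
    recent = [
        value
        for stamp, value in samples
        if datetime.strptime(stamp, TIMESTAMP_FORMAT) >= cutoff
    ]
    return median(recent) if recent else None
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, keeps the sketch deterministic and easy to test.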
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
name: 'Aggregate compute-benchmark results and produce historical averages' | ||
|
||
# The benchmarking workflow in sycl-linux-run-tests.yml passes or fails based on | ||
# how the benchmark results compare to a historical average: This historical | ||
# average is calculated in this composite workflow, which aggregates historical | ||
# data and produces measures of central tendency (median in this case) used for | ||
# this purpose. | ||
# | ||
# This action assumes that /devops has been checked out in ./devops. This action | ||
# also assumes that GITHUB_TOKEN was properly set in env, because according to | ||
# Github, that's apparently the recommended way to pass a secret into a github | ||
# action: | ||
# | ||
# https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions#accessing-your-secrets | ||
# | ||
|
||
inputs: | ||
lookback_days: | ||
type: number | ||
required: true | ||
|
||
runs: | ||
using: "composite" | ||
steps: | ||
- name: Obtain oldest timestamp allowed for data in aggregation | ||
shell: bash | ||
run: | | ||
# DO NOT use inputs.lookback_days directly, only use SANITIZED_TIMESTAMP. | ||
SANITIZED_LOOKBACK_DAYS="$(echo '${{ inputs.lookback_days }}' | grep -oE '^[0-9]+$')" | ||
if [ -z "$SANITIZED_LOOKBACK_DAYS" ]; then | ||
echo "Please ensure inputs.lookback_days is a number." | ||
exit 1 | ||
fi | ||
SANITIZED_TIMESTAMP="$(date -d "$SANITIZED_LOOKBACK_DAYS days ago" +%Y%m%d_%H%M%S)" | ||
I'd suggest applying a grep filter to the 'date' output, followed by a -z check as well (the same approach as for $SANITIZED_LOOKBACK_DAYS).
@offsake Mind if I ask your thinking behind this? Are you trying to make sure the
I've gone ahead and made the changes anyway, but an explanation would still be appreciated; thanks! |
||
if [ -z "$(echo "$SANITIZED_TIMESTAMP" | grep -oE '^[0-9]{8}_[0-9]{6}$' )" ]; then | ||
echo "Invalid timestamp generated: is inputs.lookback_days valid?" | ||
exit 1 | ||
fi | ||
echo "SANITIZED_TIMESTAMP=$SANITIZED_TIMESTAMP" >> $GITHUB_ENV | ||
- name: Load benchmarking configuration | ||
shell: bash | ||
run: | | ||
$(python ./devops/scripts/benchmarking/load_config.py ./devops constants) | ||
echo "SANITIZED_PERF_RES_GIT_REPO=$SANITIZED_PERF_RES_GIT_REPO" >> $GITHUB_ENV | ||
ianayl marked this conversation as resolved.
|
||
echo "SANITIZED_PERF_RES_GIT_BRANCH=$SANITIZED_PERF_RES_GIT_BRANCH" >> $GITHUB_ENV | ||
- name: Checkout historical performance results repository | ||
shell: bash | ||
run: | | ||
if [ ! -d ./llvm-ci-perf-results ]; then | ||
git clone -b "$SANITIZED_PERF_RES_GIT_BRANCH" "https://github.com/$SANITIZED_PERF_RES_GIT_REPO" ./llvm-ci-perf-results | ||
fi | ||
- name: Run aggregator on historical results | ||
shell: bash | ||
run: | | ||
# The current format of the historical results repository is: | ||
# | ||
# /<ONEAPI_DEVICE_SELECTOR>/<runner>/<test name> | ||
# | ||
# Thus, a min/max depth of 3 is used to enumerate all test cases in the | ||
# repository. Test name is also derived from here. | ||
find ./llvm-ci-perf-results -mindepth 3 -maxdepth 3 -type d ! -path '*.git*' | | ||
while read -r dir; do | ||
test_name="$(basename "$dir")" | ||
python ./devops/scripts/benchmarking/aggregate.py ./devops "$test_name" "$dir" "$SANITIZED_TIMESTAMP" | ||
ianayl marked this conversation as resolved.
|
||
done | ||
- name: Upload average to the repo | ||
shell: bash | ||
run: | | ||
cd ./llvm-ci-perf-results | ||
git config user.name "SYCL Benchmarking Bot" | ||
git config user.email "[email protected]" | ||
git pull | ||
# Make sure changes have been made | ||
if git diff --quiet && git diff --cached --quiet; then | ||
echo "No changes to median, skipping push." | ||
else | ||
git add . | ||
git commit -m "[GHA] Aggregate median data from $SANITIZED_TIMESTAMP to $(date +%Y%m%d_%H%M%S)" | ||
git push "https://[email protected]/$SANITIZED_PERF_RES_GIT_REPO.git" "$SANITIZED_PERF_RES_GIT_BRANCH" | ||
fi | ||
- name: Find aggregated average results artifact here | ||
if: always() | ||
shell: bash | ||
run: | | ||
cat << EOF | ||
# | ||
# Artifact link for aggregated averages here: | ||
# | ||
EOF | ||
- name: Archive new medians | ||
if: always() | ||
uses: actions/upload-artifact@v4 | ||
with: | ||
name: llvm-ci-perf-results new medians | ||
path: ./llvm-ci-perf-results/**/*-median.csv |
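The first step of this action validates its input twice: once on the raw `lookback_days` string and once on the generated timestamp, as suggested in the review thread. The sketch below mirrors that two-stage check in Python under stated assumptions. The function names are hypothetical, and the action itself does this in bash with `grep -oE` and `date -d`.

```python
import re
from datetime import datetime, timedelta


def sanitize_lookback_days(raw):
    """Accept only a string made entirely of ASCII digits, like grep '^[0-9]+$'."""
    if not re.fullmatch(r"[0-9]+", raw):
        raise ValueError("Please ensure lookback_days is a number.")
    return int(raw)


def oldest_allowed_timestamp(raw_days, now):
    """Return the cutoff as YYYYmmdd_HHMMSS, re-validating the generated string."""
    days = sanitize_lookback_days(raw_days)
    stamp = (now - timedelta(days=days)).strftime("%Y%m%d_%H%M%S")
    # Second check: only a well-formed timestamp is allowed to leave this function.
    if not re.fullmatch(r"[0-9]{8}_[0-9]{6}", stamp):
        raise ValueError("Invalid timestamp generated: is lookback_days valid?")
    return stamp
```

The second check looks redundant when the first one passes, but it guards the value that later steps actually consume, so a surprising formatter output cannot reach them unnoticed.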
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
name: 'Run compute-benchmarks' | ||
|
||
# Run compute-benchmarks on SYCL | ||
# | ||
# This action assumes SYCL is in ./toolchain, and that /devops has been | ||
# checked out in ./devops. This action also assumes that GITHUB_TOKEN | ||
# was properly set in env, because according to Github, that's apparently the | ||
# recommended way to pass a secret into a github action: | ||
# | ||
# https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions#accessing-your-secrets | ||
# | ||
# This action also expects a RUNNER_TAG environment variable to be set to the | ||
# runner tag used to run this workflow: Currently, only gen12 and pvc on Linux | ||
# are fully supported. Although this workflow won't stop you from running on | ||
# other devices, note that only gen12 and pvc have been tested to work. | ||
# | ||
|
||
inputs: | ||
target_devices: | ||
type: string | ||
required: True | ||
|
||
runs: | ||
using: "composite" | ||
steps: | ||
- name: Check specified runner type / target backend | ||
shell: bash | ||
env: | ||
TARGET_DEVICE: ${{ inputs.target_devices }} | ||
run: | | ||
case "$RUNNER_TAG" in | ||
'["Linux", "gen12"]' | '["Linux", "pvc"]') ;; | ||
*) | ||
echo "#" | ||
echo "# WARNING: Only gen12/pvc on Linux is fully supported." | ||
echo "# This workflow is not guaranteed to work with other runners." | ||
echo "#" ;; | ||
esac | ||
|
||
# inputs.target_devices is not used directly here, as that would allow code injection | ||
case "$TARGET_DEVICE" in | ||
level_zero:*) ;; | ||
*) | ||
echo "#" | ||
echo "# WARNING: Only level_zero backend is fully supported." | ||
echo "# This workflow is not guaranteed to work with other backends." | ||
echo "#" ;; | ||
esac | ||
- name: Run compute-benchmarks | ||
shell: bash | ||
run: | | ||
cat << EOF | ||
# | ||
# NOTE TO DEVELOPERS: | ||
# | ||
|
||
Check latter steps of the workflow: This job produces an artifact with: | ||
- benchmark results from passing/failing tests | ||
- log containing all failing (too slow) benchmarks | ||
- log containing all erroring benchmarks | ||
|
||
While this step in the workflow provides debugging output describing this | ||
information, it might be easier to inspect the logs from the artifact | ||
instead. | ||
|
||
EOF | ||
export ONEAPI_DEVICE_SELECTOR="${{ inputs.target_devices }}" | ||
export CMPLR_ROOT=./toolchain | ||
echo "-----" | ||
sycl-ls | ||
echo "-----" | ||
./devops/scripts/benchmarking/benchmark.sh -n '${{ runner.name }}' -s || exit 1 | ||
- name: Push compute-benchmarks results | ||
if: always() | ||
shell: bash | ||
run: | | ||
# TODO -- waiting on security clearance | ||
# Load configuration values | ||
$(python ./devops/scripts/benchmarking/load_config.py ./devops constants) | ||
|
||
cd "./llvm-ci-perf-results" | ||
git config user.name "SYCL Benchmarking Bot" | ||
git config user.email "[email protected]" | ||
git pull | ||
git add . | ||
# Make sure changes have been made | ||
if git diff --quiet && git diff --cached --quiet; then | ||
echo "No new results added, skipping push." | ||
else | ||
git commit -m "[GHA] Upload compute-benchmarks results from https://github.com/intel/llvm/actions/runs/${{ github.run_id }}" | ||
git push "https://[email protected]/$SANITIZED_PERF_RES_GIT_REPO.git" "$SANITIZED_PERF_RES_GIT_BRANCH" | ||
fi | ||
- name: Find benchmark result artifact here | ||
if: always() | ||
shell: bash | ||
run: | | ||
cat << EOF | ||
# | ||
# Artifact link for benchmark results here: | ||
# | ||
EOF | ||
- name: Archive compute-benchmark results | ||
if: always() | ||
uses: actions/upload-artifact@v4 | ||
with: | ||
name: Compute-benchmark run ${{ github.run_id }} (${{ runner.name }}) | ||
path: ./artifact |
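The first step of this action only warns about untested runners and backends; it never fails the job. A minimal Python sketch of the same decision logic follows. The function name `support_warnings` is hypothetical, and the action itself uses bash `case` statements on `$RUNNER_TAG` and `$TARGET_DEVICE`.

```python
# Runner tags the action treats as fully supported, compared as exact strings.
SUPPORTED_RUNNER_TAGS = ('["Linux", "gen12"]', '["Linux", "pvc"]')


def support_warnings(runner_tag, target_device):
    """Return the warnings the check would print; an empty list means none."""
    warnings = []
    if runner_tag not in SUPPORTED_RUNNER_TAGS:
        warnings.append("Only gen12/pvc on Linux is fully supported.")
    # Mirrors the shell pattern `level_zero:*`, which is a plain prefix match.
    if not target_device.startswith("level_zero:"):
        warnings.append("Only level_zero backend is fully supported.")
    return warnings
```

For example, `support_warnings('["Linux", "gen12"]', "level_zero:gpu")` yields no warnings, while an OpenCL selector on any other runner yields both.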
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
; | ||
; This file contains configuration options to change the behaviour of the | ||
; benchmarking workflow in sycl-linux-run-tests.yml. | ||
; | ||
; DO NOT USE THE CONTENTS OF THIS FILE DIRECTLY -- Due to security concerns, the | ||
; contents of this file must be sanitized before use. | ||
; See: /devops/scripts/benchmarking/common.py | ||
; | ||
|
||
; Compute-benchmark compile/run options | ||
[compute_bench] | ||
; Value for -j during compilation of compute-benchmarks | ||
compile_jobs = 2 | ||
; Number of iterations to run compute-benchmark tests | ||
iterations = 100 | ||
|
||
; Options for benchmark result metrics (to record/compare against) | ||
[metrics] | ||
; Sets the metrics to record/aggregate in the historical average. | ||
; Format: comma-separated list of column names in compute-benchmark results | ||
recorded = Median,StdDev | ||
; Sets the tolerance for each recorded metric and their allowed deviation from | ||
; the historical average. Metrics not included here are not compared against | ||
; when passing/failing benchmark results. | ||
; Format: comma-separated list of <metric>:<deviation percentage in decimals> | ||
tolerances = Median:0.5 | ||
|
||
; Options for computing historical averages | ||
[average] | ||
; Number of days (from today) to look back for results when computing historical | ||
; average | ||
cutoff_range = 7 | ||
; Minimum number of samples required to compute a historical average | ||
min_threshold = 3 | ||
|
||
; ONEAPI_DEVICE_SELECTOR linting/options | ||
[device_selector] | ||
; Backends to allow in device_selector | ||
enabled_backends = level_zero,opencl,cuda,hip | ||
; native_cpu is disabled | ||
|
||
; Devices to allow in device_selector | ||
enabled_devices = cpu,gpu | ||
; fpga is disabled |
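As an illustration of how the `[metrics]` and `[average]` options above might be read and validated, here is a minimal sketch using Python's standard `configparser`. It is not the sanitization code in /devops/scripts/benchmarking/common.py, which is not shown on this page. The helper names and the pass/fail rule (a result fails when it exceeds the historical value by more than the tolerance fraction) are assumptions based on the config comments and the "too slow" wording in the action.

```python
import configparser
import re

# A trimmed copy of the options shown above, inlined to keep the sketch runnable.
SAMPLE_CONFIG = """
[metrics]
recorded = Median,StdDev
tolerances = Median:0.5

[average]
cutoff_range = 7
min_threshold = 3
"""


def parse_tolerances(raw):
    """Parse '<metric>:<decimal>' pairs, rejecting anything malformed."""
    tolerances = {}
    for entry in raw.split(","):
        match = re.fullmatch(r"\s*([A-Za-z]+)\s*:\s*([0-9]+(?:\.[0-9]+)?)\s*", entry)
        if match is None:
            raise ValueError(f"Malformed tolerance entry: {entry!r}")
        tolerances[match.group(1)] = float(match.group(2))
    return tolerances


def is_too_slow(value, historical, tolerance):
    """Assumed rule: fail when value exceeds historical by more than tolerance."""
    return value > historical * (1 + tolerance)


config = configparser.ConfigParser()
config.read_string(SAMPLE_CONFIG)
tolerances = parse_tolerances(config["metrics"]["tolerances"])
min_threshold = config.getint("average", "min_threshold")
```

With `Median:0.5`, a run measuring 160 against a historical median of 100 would fail under this assumed rule, while 150 would still pass.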
@ianayl hi, likely there should also be
github.repository == 'intel/llvm' && ...
to avoid running this in forks. @intel/dpcpp-devops-reviewers FYI
Btw: fix for this is in #17122, I will merge after I do some final testing with my tuning here