Commit 37de823

Improve test runner performance by running from CRaC checkpoint
The test runner suffers from the slow startup time of the JVM. My goal with this PR was to significantly improve the performance of the Java test runner.

At the time of writing there are several options to improve JVM startup time:
1. Graal native image (AOT)
2. Shared AppCDS
3. Project Leyden
4. CRaC

Native image can't be used for the test runner, as the runner has to dynamically load classes at runtime and Graal AOT depends on a closed-world assumption.

Shared AppCDS improves performance. In my testing a test run went down from ~4s to ~3s.

Project Leyden is very promising, as it tries to do as much work as possible ahead of time without making a closed-world assumption. At the time of this writing the project is only in early access, so it will probably take a while before it lands in an LTS release.

CRaC brings the best performance improvements, but with some caveats:
- the build gets more complicated
- the CRIU-based engine needs to run with two additional capabilities:
  - CHECKPOINT_RESTORE
  - SYS_PTRACE
- the speedup depends on how well the JVM gets warmed up before the checkpoint is taken

bin/run-tests-in-docker.sh had to be adjusted to start a new container for each test. The restored JVM needs to run as a specific PID, so it can only be restored once per container life cycle. The test run still finishes faster than before.

This commit uses the CRIU engine, as I had some issues getting the warp engine to work properly. The warp engine is also only supported by Azul right now and isn't compatible with musl / Alpine yet.

In my tests the runtime of my example test went down from ~4s to >1s.

By switching to an Alpine-based image this change also reduces the size of the container (exercism/java-test-runner-crac-checkpoint) to 271MB, down from 464MB.

CRaC documentation:
- https://crac.org/
- https://docs.azul.com/core/crac/crac-introduction
- https://openjdk.org/projects/crac/
1 parent 67629f0 commit 37de823

11 files changed, +218 -26 lines changed

.dockerignore

-3
@@ -1,15 +1,12 @@
 .appends
 .git
 .github
-.gradle
 .idea
-gradle
 tests
 .dockerignore
 .gitattributes
 .gitignore
 bin/run-in-docker.sh
 bin/run-tests.sh
 bin/run-tests-in-docker.sh
-gradlew
 gradlew.bat

Dockerfile

+7-6
@@ -1,13 +1,14 @@
-FROM gradle:8.12-jdk21 AS build
+FROM bellsoft/liberica-runtime-container:jdk-21-crac-musl AS build

 WORKDIR /app
-COPY --chown=gradle:gradle . /app
-RUN gradle -i --stacktrace clean build
+COPY . /app
+RUN /app/gradlew -i --stacktrace clean build

-FROM eclipse-temurin:21
+FROM bellsoft/liberica-runtime-container:jdk-21-crac-musl

 WORKDIR /opt/test-runner
-COPY bin/run.sh bin/run.sh
+COPY bin/run-to-create-crac-checkpoint.sh bin/run-to-create-crac-checkpoint.sh
+COPY bin/run-restore-from-checkpoint.sh bin/run-restore-from-checkpoint.sh
 COPY --from=build /app/build/libs/java-test-runner.jar .

-ENTRYPOINT ["sh", "/opt/test-runner/bin/run.sh"]
+ENTRYPOINT ["sh", "/opt/test-runner/bin/run-to-create-crac-checkpoint.sh"]

bin/build-crac-checkpoint-image.sh

+38
@@ -0,0 +1,38 @@
+#!/usr/bin/env sh
+
+# Synopsis:
+# Build a Docker image containing a CRaC checkpoint to restart from.
+# An initial image is built. A container is created from that image
+# and tests are run to warm up the JVM. Then a checkpoint is created.
+# The final image is created by committing the container
+# containing the checkpoint.
+
+docker build -t exercism/java-test-runner-crac-checkpoint .
+
+# Copy all tests into one merged project, so we can warm up the JVM
+# TODO(FAP): this is missing some tests as most tests use the same filenames
+mkdir -p tests/merged
+for dir in tests/*; do
+    if [ -d "$dir" ] && [ "$dir" != "tests/merged" ]; then
+        rsync -a "$dir"/ tests/merged/
+    fi
+done
+
+slug="merged"
+solution_dir=$(realpath "tests/merged/")
+output_dir=$(realpath "tests/merged/")
+
+docker run --cap-add CHECKPOINT_RESTORE \
+    --cap-add SYS_PTRACE \
+    --name java-test-runner-crac \
+    --network none \
+    --mount type=bind,src="${solution_dir}",dst=/solution \
+    --mount type=bind,src="${output_dir}",dst=/output \
+    --mount type=tmpfs,dst=/tmp \
+    --tmpfs /openjfx:exec,rw \
+    exercism/java-test-runner-crac-checkpoint "${slug}" /solution /output
+
+docker commit --change='ENTRYPOINT ["sh", "/opt/test-runner/bin/run-restore-from-checkpoint.sh"]' java-test-runner-crac exercism/java-test-runner-crac-restore
+
+docker rm -f java-test-runner-crac
+rm -rf tests/merged/
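
The merge step in bin/build-crac-checkpoint-image.sh can be tried in isolation with the minimal sketch below. `merge_test_dirs` is a hypothetical helper that is not part of the commit, and it uses `cp -R` instead of `rsync` so that it runs without extra tools. It skips the merge target itself and, as the TODO in the script notes, lets later projects overwrite files that share a name with earlier ones.

```shell
#!/usr/bin/env sh

# Hypothetical helper (not part of the commit): copy the contents of every
# directory under "$1" into "$1/merged", skipping the merge target itself.
merge_test_dirs() {
    root="$1"
    merged="${root}/merged"
    mkdir -p "${merged}"
    for dir in "${root}"/*; do
        # The glob yields paths without a trailing slash, so the comparison
        # must not contain one either.
        if [ -d "${dir}" ] && [ "${dir}" != "${merged}" ]; then
            # "dir/." copies the directory contents, including dotfiles.
            # Files with the same relative path are overwritten.
            cp -R "${dir}/." "${merged}/"
        fi
    done
}

# Small demonstration on a throwaway directory tree.
demo_root=$(mktemp -d)
mkdir -p "${demo_root}/two-fer/src" "${demo_root}/leap/src"
echo "class Twofer {}" > "${demo_root}/two-fer/src/Twofer.java"
echo "class Leap {}" > "${demo_root}/leap/src/Leap.java"
merge_test_dirs "${demo_root}"
ls "${demo_root}/merged/src"
rm -rf "${demo_root}"
```

Because files with the same relative path overwrite each other, the merged project only contains the last copy of each such file, which is why the warm-up run covers fewer tests than the sum of all test projects.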

bin/run-in-docker-without-build.sh

+40
@@ -0,0 +1,40 @@
+#!/usr/bin/env sh
+
+# Synopsis:
+# Run the test runner on a solution using the test runner Docker image.
+# The container image is assumed to be already available.
+
+# Arguments:
+# $1: exercise slug
+# $2: path to solution folder
+# $3: path to output directory
+
+# Output:
+# Writes the test results to a results.json file in the passed-in output directory.
+# The test results are formatted according to the specifications at https://github.com/exercism/docs/blob/main/building/tooling/test-runners/interface.md
+
+# Example:
+# ./bin/run-in-docker-without-build.sh two-fer path/to/solution/folder/ path/to/output/directory/
+
+# If any required argument is missing, print the usage and exit
+if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
+    echo "usage: ./bin/run-in-docker-without-build.sh exercise-slug path/to/solution/folder/ path/to/output/directory/"
+    exit 1
+fi
+
+slug="$1"
+solution_dir=$(realpath "${2%/}")
+output_dir=$(realpath "${3%/}")
+
+# Create the output directory if it doesn't exist
+mkdir -p "${output_dir}"
+
+# Run the Docker image using the settings mimicking the production environment
+docker run \
+    --rm \
+    --network none \
+    --read-only \
+    --mount type=bind,src="${solution_dir}",dst=/solution \
+    --mount type=bind,src="${output_dir}",dst=/output \
+    --mount type=tmpfs,dst=/tmp \
+    exercism/java-test-runner-crac-restore "${slug}" /solution /output
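
The path handling in the script above relies on the `${2%/}` parameter expansion followed by `realpath`. The following sketch isolates that step; `resolve_dir` is a hypothetical name introduced only for illustration, and it assumes a `realpath` binary is available (as the script itself does).

```shell
#!/usr/bin/env sh

# Hypothetical helper (not part of the commit): normalize a directory
# argument the same way the script treats $2 and $3.
resolve_dir() {
    # "${1%/}" strips a single trailing slash, if present.
    # realpath then prints the absolute path with symlinks resolved.
    realpath "${1%/}"
}
```

With this, `resolve_dir path/to/solution/` and `resolve_dir path/to/solution` print the same absolute path, so bind-mount sources passed to `docker run` are consistent regardless of how the caller wrote the argument. An argument of exactly `/` is reduced to an empty string by the expansion, so this sketch does not handle the filesystem root.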

bin/run-in-docker.sh

+2-2
@@ -30,7 +30,7 @@ output_dir=$(realpath "${3%/}")
 mkdir -p "${output_dir}"

 # Build the Docker image
-docker build --rm -t exercism/java-test-runner .
+bin/build-crac-checkpoint-image.sh

 # Run the Docker image using the settings mimicking the production environment
 docker run \
@@ -40,4 +40,4 @@ docker run \
     --mount type=bind,src="${solution_dir}",dst=/solution \
     --mount type=bind,src="${output_dir}",dst=/output \
     --mount type=tmpfs,dst=/tmp \
-    exercism/java-test-runner "${slug}" /solution /output
+    exercism/java-test-runner-crac-restore "${slug}" /solution /output

bin/run-restore-from-checkpoint.sh

+41
@@ -0,0 +1,41 @@
+#!/usr/bin/env sh
+
+# Synopsis:
+# Run the test runner on a solution by restoring the JVM from the CRaC checkpoint.
+# The checkpoint is read from /opt/test-runner/crac-checkpoint.
+
+# Arguments:
+# $1: exercise slug
+# $2: path to solution folder
+# $3: path to output directory
+
+# Output:
+# Writes the test results to a results.json file in the passed-in output directory.
+# The test results are formatted according to the specifications at https://github.com/exercism/docs/blob/main/building/tooling/test-runners/interface.md
+
+# Example:
+# ./bin/run-restore-from-checkpoint.sh two-fer path/to/solution/folder/ path/to/output/directory/
+
+if [ $# -lt 3 ]
+then
+    echo "Usage:"
+    echo "./bin/run-restore-from-checkpoint.sh two-fer ~/input/ ~/output/"
+    exit 1
+fi
+
+problem_slug="$1"
+input_folder="$2"
+output_folder="$3"
+tmp_folder="/tmp/solution"
+
+mkdir -p "$output_folder"
+
+rm -rf "$tmp_folder"
+mkdir -p "$tmp_folder"
+
+cd "$tmp_folder"
+cp -R "$input_folder"/* .
+
+find . -mindepth 1 -type f | grep 'Test.java' | xargs -I file sed -i "s/@Ignore(.*)//g;s/@Ignore//g;s/@Disabled(.*)//g;s/@Disabled//g;" file
+
+java -XX:CRaCRestoreFrom=/opt/test-runner/crac-checkpoint com.exercism.Restore "$problem_slug" . "$output_folder"
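
The `find | grep | xargs sed` line above enables all tests by deleting `@Ignore` and `@Disabled` annotations. A minimal sketch of the same idea follows. `enable_all_tests` is a hypothetical helper, it selects files with `find -name '*Test.java'` rather than the script's `grep 'Test.java'` (which matches anywhere in the path), and it assumes a `sed` that accepts `-i` without a suffix (GNU or BusyBox).

```shell
#!/usr/bin/env sh

# Hypothetical helper (not part of the commit): strip @Ignore and @Disabled
# annotations from every *Test.java file below the directory given as $1.
enable_all_tests() {
    # Parentheses are literal in basic regular expressions, so
    # "@Disabled(.*)" matches the annotation together with its argument.
    # The match is greedy: it extends to the last ")" on the same line.
    find "$1" -type f -name '*Test.java' -exec \
        sed -i 's/@Ignore(.*)//g;s/@Ignore//g;s/@Disabled(.*)//g;s/@Disabled//g' {} +
}
```

Calling `enable_all_tests .` in the copied solution directory leaves other annotations such as `@Test` untouched and does not modify files whose names do not end in `Test.java`.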

bin/run-tests-in-docker.sh

+25-13
@@ -13,16 +13,28 @@
 # ./bin/run-tests-in-docker.sh

 # Build the Docker image
-docker build --rm -t exercism/java-test-runner .
-
-# Run the Docker image using the settings mimicking the production environment
-docker run \
-    --rm \
-    --network none \
-    --read-only \
-    --mount type=bind,src="${PWD}/tests",dst=/opt/test-runner/tests \
-    --mount type=tmpfs,dst=/tmp \
-    --volume "${PWD}/bin/run-tests.sh:/opt/test-runner/bin/run-tests.sh" \
-    --workdir /opt/test-runner \
-    --entrypoint /opt/test-runner/bin/run-tests.sh \
-    exercism/java-test-runner
+bin/build-crac-checkpoint-image.sh
+
+exit_code=0
+
+# Iterate over all test directories
+for test_dir in tests/*; do
+    test_dir_name=$(basename "${test_dir}")
+    test_dir_path=$(realpath "${test_dir}")
+    results_file_path="${test_dir_path}/results.json"
+    expected_results_file_path="${test_dir_path}/expected_results.json"
+
+    bin/run-in-docker-without-build.sh "${test_dir_name}" "${test_dir_path}" "${test_dir_path}"
+
+    # Normalize the results file
+    sed -i "s~${test_dir_path}~/solution~g" "${results_file_path}"
+
+    echo "${test_dir_name}: comparing results.json to expected_results.json"
+    diff "${results_file_path}" "${expected_results_file_path}"
+
+    if [ $? -ne 0 ]; then
+        exit_code=1
+    fi
+done
+
+exit ${exit_code}
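
The comparison part of the loop above can be exercised without Docker. The sketch below assumes that each test directory already contains a results.json and an expected_results.json; `compare_all_results` is a hypothetical helper, not part of the commit, and it omits the container run and the path normalization.

```shell
#!/usr/bin/env sh

# Hypothetical helper (not part of the commit): compare results.json with
# expected_results.json in every sub directory of "$1".
# Returns 0 if all pairs are identical and 1 otherwise.
compare_all_results() {
    status=0
    for test_dir in "$1"/*; do
        # Skip plain files; only directories are treated as test projects.
        [ -d "${test_dir}" ] || continue
        name=$(basename "${test_dir}")
        if diff "${test_dir}/results.json" "${test_dir}/expected_results.json" >/dev/null; then
            echo "${name}: ok"
        else
            # A missing file also makes diff fail, so it counts as a mismatch.
            echo "${name}: results differ"
            status=1
        fi
    done
    return "${status}"
}
```

As in the script, a single mismatch does not stop the loop: every directory is checked, and the failure is only reported through the final status.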

bin/run-to-create-crac-checkpoint.sh

+44
@@ -0,0 +1,44 @@
+#!/usr/bin/env bash
+
+# Synopsis:
+# Run the test runner on a solution to warm up the JVM, then create a CRaC checkpoint.
+# The checkpoint is written to /opt/test-runner/crac-checkpoint.
+
+# Arguments:
+# $1: exercise slug
+# $2: path to solution folder
+# $3: path to output directory
+
+# Output:
+# Writes the test results to a results.json file in the passed-in output directory.
+# The test results are formatted according to the specifications at https://github.com/exercism/docs/blob/main/building/tooling/test-runners/interface.md
+
+# Example:
+# ./bin/run-to-create-crac-checkpoint.sh two-fer path/to/solution/folder/ path/to/output/directory/
+
+if [ $# -lt 3 ]
+then
+    echo "Usage:"
+    echo "./bin/run-to-create-crac-checkpoint.sh two-fer ~/input/ ~/output/"
+    exit 1
+fi
+
+problem_slug="$1"
+input_folder="$2"
+output_folder="$3"
+tmp_folder="/tmp/solution"
+
+mkdir -p "$output_folder"
+
+rm -rf "$tmp_folder"
+mkdir -p "$tmp_folder"
+
+cd "$tmp_folder"
+cp -R "$input_folder"/* .
+
+find . -mindepth 1 -type f | grep 'Test.java' | xargs -I file sed -i "s/@Ignore(.*)//g;s/@Ignore//g;s/@Disabled(.*)//g;s/@Disabled//g;" file
+
+# -XX:-UsePerfData option worked outside of Docker, but inside of Docker the restore would fail
+# See https://docs.azul.com/core/crac/crac-debugging#restore-conflict-of-pids and https://docs.azul.com/core/crac/crac-debugging#using-cracminpid-option
+# for info about -XX:CRaCMinPid
+java -XX:CRaCMinPid=128 -XX:CRaCCheckpointTo=/opt/test-runner/crac-checkpoint -Xshare:off -jar /opt/test-runner/java-test-runner.jar "$problem_slug" . "$output_folder"

build.gradle

+2
@@ -24,6 +24,8 @@ dependencies {
     implementation "com.fasterxml.jackson.datatype:jackson-datatype-jdk8:$jacksonVersion"
     implementation 'com.github.javaparser:javaparser-core:3.26.2'

+    implementation 'org.crac:crac:1.5.0'
+
     implementation 'org.assertj:assertj-core:3.25.3'
     implementation 'org.apiguardian:apiguardian-api:1.1.2' // https://github.com/exercism/java-test-runner/issues/79
     implementation platform('org.junit:junit-bom:5.11.3')
+11
@@ -0,0 +1,11 @@
+package com.exercism;
+
+import java.io.IOException;
+
+public class Restore {
+
+    public static void main(String[] args) throws IOException {
+        new TestRunner(args[0], args[1], args[2]).run();
+    }
+
+}

src/main/java/com/exercism/TestRunner.java

+8-2
@@ -17,6 +17,10 @@
 import java.util.List;
 import java.util.stream.Stream;

+import org.crac.CheckpointException;
+import org.crac.Core;
+import org.crac.RestoreException;
+
 public final class TestRunner {

     private final JUnitTestParser testParser;
@@ -33,14 +37,16 @@ public TestRunner(String slug, String inputDirectory, String outputDirectory) {
         this.inputDirectory = inputDirectory;
     }

-    public static void main(String[] args) throws IOException {
+    public static void main(String[] args) throws IOException, CheckpointException, RestoreException {
         if (args.length < 3) {
             throw new IllegalArgumentException("Not enough arguments, need <slug> <inputDirectory> <outputDirectory>");
         }
         new TestRunner(args[0], args[1], args[2]).run();
+
+        Core.checkpointRestore();
     }

-    private void run() throws IOException {
+    void run() throws IOException {
         var sourceFiles = resolveSourceFiles();
         var testFiles = resolveTestFiles();
