Speedup jsonnet generation by running in parallel #1908
sthaha commented Mar 2, 2023
- I added CHANGELOG entry for this change.
- No user facing changes, so no entry in CHANGELOG was needed.
Do you have some performance numbers on the speedup you're seeing? I did a
This PR:
I don't think
If you check out the PR and run this locally, the difference is quite perceptible (at least on my 8-core Linux box). Maybe try looking at only the total time.
On my machine, I see that this PR makes it at least 3 times faster.
I still get very similar results with
PR:
I think this could be the issue, that
I think a better way is to time the script itself. Could you please try this?
diff --git a/Makefile b/Makefile
index b77423bc..2cf60da1 100644
--- a/Makefile
+++ b/Makefile
@@ -130,7 +130,7 @@ $(ASSETS): build-jsonnet
.PHONY: build-jsonnet
build-jsonnet: $(JSONNET_BIN) $(GOJSONTOYAML_BIN) $(JSONNET_SRC) $(JSONNET_VENDOR) json-manifests json-crds
- ./hack/build-jsonnet.sh
+ time ./hack/build-jsonnet.sh
$(JSON_MANIFESTS): $(MANIFESTS)
cat $(MANIFESTS_DIR)/$(patsubst %.json,%.yaml,$(@F)) | $(GOJSONTOYAML_BIN) -yamltojson > $@
Yeah that looks more promising:
PR:
hack/build-jsonnet.sh
Outdated
for file in "${files[@]}"; do
  (
    dir=$(dirname "${file}")
    path="${prefix}/${dir}"
    mkdir -p "${path}"

    # convert file name from camelCase to snake-case
    fullfile=$(echo "${file}" | sed 's/\(.\)\([A-Z]\)/\1-\2/g' | tr '[:upper:]' '[:lower:]')
    jq -r ".[\"${file}\"]" "${TMP}/main.json" | gojsontoyaml > "${prefix}/${fullfile}.yaml"
  )&
-for file in "${files[@]}"; do
-  (
-    dir=$(dirname "${file}")
-    path="${prefix}/${dir}"
-    mkdir -p "${path}"
-    # convert file name from camelCase to snake-case
-    fullfile=$(echo "${file}" | sed 's/\(.\)\([A-Z]\)/\1-\2/g' | tr '[:upper:]' '[:lower:]')
-    jq -r ".[\"${file}\"]" "${TMP}/main.json" | gojsontoyaml > "${prefix}/${fullfile}.yaml"
-  )&
+for file in "${files[@]}"; do
+  dir=$(dirname "${file}")
+  path="${prefix}/${dir}"
+  mkdir -p "${path}"
+  # convert file name from camelCase to snake-case
+  fullfile=$(echo "${file}" | sed 's/\(.\)\([A-Z]\)/\1-\2/g' | tr '[:upper:]' '[:lower:]')
+  jq -r ".[\"${file}\"]" "${TMP}/main.json" | gojsontoyaml > "${prefix}/${fullfile}.yaml" &
I get slightly better results with this change (it shaves a few seconds off fairly consistently). Can you give this a try as well?
This was my first version :) but I felt that starting a subshell at the start of the for loop makes it easier to reason about, since it boils down to
for f in files {
  ( process this file ) &   // in background
  wait if there are more than N processes
}
I think we can achieve the same without starting an explicit subshell by using a block and running that in the background.
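The pattern described above can be sketched in bash roughly as follows. This is a minimal illustration, not the script from the PR: `process_all`, `max_jobs`, and the `tr` stand-in for the real per-file work are hypothetical, and `wait -n` assumes bash 4.3 or newer. Note that bash still runs a backgrounded brace group in a child process, so the difference from `( ... ) &` is mostly in how the code reads.

```shell
#!/usr/bin/env bash
# Sketch only: process_all and max_jobs are hypothetical names, not from the PR.

process_all() {
  local max_jobs=$1 outdir=$2
  shift 2
  local item
  for item in "$@"; do
    # A brace group followed by & is started as a single background job,
    # so the whole body can be read like one function call.
    {
      printf '%s\n' "${item}" | tr '[:lower:]' '[:upper:]' > "${outdir}/${item}.txt"
    } &

    # Throttle: once max_jobs jobs are running, block until one finishes.
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
      wait -n || true   # exit status is ignored in this sketch
    done
  done
  wait   # wait for whatever is still running
}

# Example run with at most 2 concurrent jobs.
out=$(mktemp -d)
process_all 2 "${out}" alpha beta gamma
ls "${out}"
```

A real script would probably want to collect the exit status of each job instead of discarding it, so that a failed conversion fails the build.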
3305ebf to 03f4e79
Previously, jsonnet to yaml conversion was done sequentially and this took quite some time. The patch speeds that up by running the conversions in parallel. The number of jobs to run in parallel is determined by the number of CPU cores (thus requires nproc to be in path).
Signed-off-by: Sunil Thaha <[email protected]>
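As a rough illustration of deriving the job count from the CPU core count, a helper along these lines could be used. The commit message only says that `nproc` is required; the function name `max_parallel_jobs` and the `getconf` fallback are assumptions added for this sketch and are not part of the PR.

```shell
#!/usr/bin/env bash
# Sketch only: max_parallel_jobs is a hypothetical helper, not from the PR.

max_parallel_jobs() {
  if command -v nproc >/dev/null 2>&1; then
    # nproc (GNU coreutils) prints the number of available processing units.
    nproc
  else
    # Assumed fallback for systems without nproc; defaults to 1 job.
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
  fi
}

echo "running up to $(max_parallel_jobs) jobs in parallel"
```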
03f4e79 to 7eb4068
/retest required
/retest-required
  # convert file name from camelCase to snake-case
  fullfile=$(echo "${file}" | sed 's/\(.\)\([A-Z]\)/\1-\2/g' | tr '[:upper:]' '[:lower:]')
  jq -r ".[\"${file}\"]" "${TMP}/main.json" | gojsontoyaml > "${prefix}/${fullfile}.yaml"
}&
@jan--f this should give us the same perf as
jq -r ".[\"${file}\"]" "${TMP}/main.json" | gojsontoyaml > "${prefix}/${fullfile}.yaml" &
with the additional benefit of being easier to reason about, i.e. the entire block can be considered a function.
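The file name conversion used inside that block can be tried on its own. The sketch below wraps the same `sed` and `tr` pipeline in a hypothetical function, `to_dashed`, which is not part of the PR. The result is hyphen-separated and lower-case, even though the code comment calls it snake-case.

```shell
#!/usr/bin/env bash
# Sketch only: to_dashed is a hypothetical wrapper around the pipeline quoted above.

to_dashed() {
  # Insert a hyphen before each upper-case letter that follows another
  # character, then lower-case the whole string.
  printf '%s\n' "$1" | sed 's/\(.\)\([A-Z]\)/\1-\2/g' | tr '[:upper:]' '[:lower:]'
}

to_dashed "clusterRoleBinding"   # prints cluster-role-binding
```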
/retest-required
6 similar comments
@jan--f now that the CI gods have blessed this, can you merge this one, please?
Yes, sorry for the lag!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jan--f, sthaha
@sthaha: all tests passed!