-
Notifications
You must be signed in to change notification settings - Fork 6.5k
fix(django): retry-based e2e testing #5340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(django): retry-based e2e testing #5340
Conversation
No region tags are edited in this PR.This comment is generated by snippet-bot.
|
permissions do not currently permit cloudbuild to create sqlusers
run/django/test/retry.sh
Outdated
if [ $? -eq 0 ]; then | ||
echo "running: $2" | ||
$($2 > /dev/null) | ||
else return 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@averikitsch Updated line from nodejs-docs-samples version: If running in a two-arg state, if the first command failed, this failure wouldn't be captured by the until loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original assumption here is if the first command fails, we don't want to retry it since it's most likely a "describe" command. I would assume that "describe" commands wouldn't fail or be flaky like delete or deploy commands. The goals is aimed at capturing the failure of the second command. I'm open to changing that logic or adding better comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was using the script in an "create/describe" format, and without this change 409 errors were falling through. Adding this change would extend it's usage. The script comments describe usage for "action" or "exists/action", so I was using it incorrectly. I will address.
run/django/test/retry.sh
Outdated
if ((attempt_num==max_attempts)) | ||
then | ||
echo "Attempt $attempt_num / $max_attempts failed! No more retries left!" | ||
exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@averikitsch Update from the nodejs-docs-sample version (opinionated): should the retry script fail entirely, it should return an error state, meaning the cloud build step should fail, thus the job fails, and the test should fail.
This should allow for setup failures to fail early, rather than steps failing later due to the earlier failure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow. The exit 1
should cause the script and then the build step to fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with this change, the step will only fail if the last exit status is 1, so in theory if you use multiple retry.sh in one step, there is a chance it can fail and continue, depending on the order and success of commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Would this have to be handled in the cloud build to exit sooner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the example I borrowed from, the cleanup was all done in one step. If and only if the last command failed in the step would the step fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI I backed out these changes and updated my use of the script to correct the issues I was experiencing.
Description
Addresses #5307 #5252
Uses retry for setup/cleanup of infra, should help with flaky.
Some infra commands outside of retry.sh because permissions (would require cloud sql admin)
Checklist
nox -s py-3.6
(see Test Environment Setup)nox -s lint
(see Test Environment Setup)