fix(django): retry-based e2e testing #5340

Merged
merged 21 commits into from
Feb 11, 2021
Conversation

@glasnt (Contributor) commented Feb 7, 2021

Description

Addresses #5307 #5252

Uses retry.sh for setup and cleanup of infrastructure, which should help with flaky tests.

Some infrastructure commands stay outside of retry.sh because of permissions (this would require Cloud SQL Admin).

@google-cla bot added the "cla: yes" label (This human has signed the Contributor License Agreement.) Feb 7, 2021
@product-auto-label bot added the "samples" label (Issues that are directly related to samples.) Feb 7, 2021
@snippet-bot (bot) commented Feb 8, 2021

No region tags are edited in this PR.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.

@glasnt changed the title from "DRAFT WIP - retry-based django e2e testing" to "fix(django): retry-based e2e testing" Feb 8, 2021
@glasnt marked this pull request as ready for review February 8, 2021 04:58
@glasnt requested a review from a team as a code owner February 8, 2021 04:58
if [ $? -eq 0 ]; then
    echo "running: $2"
    $($2 > /dev/null)
else return 1

@glasnt (Contributor Author)

@averikitsch This line is updated from the nodejs-docs-samples version: when the script runs with two arguments and the first command fails, that failure would not be captured by the until loop.

@averikitsch (Contributor) Feb 10, 2021

My original assumption here was that if the first command fails, we don't want to retry it, since it's most likely a "describe" command. I would assume that "describe" commands wouldn't fail or be flaky the way delete or deploy commands are. The goal is to capture the failure of the second command. I'm open to changing that logic or adding better comments.

@glasnt (Contributor Author)

I was using the script in a "create/describe" format, and without this change 409 errors were falling through. Adding this change would extend its usage. The script comments describe usage for "action" or "exists/action", so I was using it incorrectly. I will address this.
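
The two usage modes mentioned in this thread ("action" and "exists/action") can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the repository's actual retry.sh: the function name `retry_sketch` and the `MAX_ATTEMPTS` and `RETRY_DELAY` variables are hypothetical, and the behaviour follows the assumption described above that a failing existence check is not retried.

```shell
# Hypothetical sketch; NOT the repository's retry.sh.
MAX_ATTEMPTS="${MAX_ATTEMPTS:-3}"   # assumed default number of attempts
RETRY_DELAY="${RETRY_DELAY:-0}"     # assumed pause (seconds) between attempts

# retry_sketch ACTION          retry ACTION until it succeeds
# retry_sketch EXISTS ACTION   run ACTION (with retries) only if EXISTS succeeds
# Commands are passed as single strings and run with eval.
retry_sketch() {
  local check="" action="$1"
  if [ "$#" -eq 2 ]; then
    check="$1"
    action="$2"
  fi

  local attempt=1
  while true; do
    if [ -n "$check" ] && ! eval "$check" > /dev/null 2>&1; then
      # The existence check failed: nothing to act on, so do not retry.
      echo "check failed, skipping: $action"
      return 0
    fi
    echo "running: $action"
    if eval "$action" > /dev/null; then
      return 0
    fi
    if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
      echo "Attempt $attempt / $MAX_ATTEMPTS failed! No more retries left!"
      return 1
    fi
    echo "Attempt $attempt / $MAX_ATTEMPTS failed! Trying again..."
    attempt=$((attempt + 1))
    sleep "$RETRY_DELAY"
  done
}
```

For example, `retry_sketch "test -f /tmp/example.lock" "rm /tmp/example.lock"` (hypothetical paths) removes the file only if it exists. In a "create/describe" pairing, by contrast, a failing first command is simply skipped in this sketch, which is consistent with errors falling through unnoticed.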

if ((attempt_num==max_attempts))
then
    echo "Attempt $attempt_num / $max_attempts failed! No more retries left!"
    exit 1

@glasnt (Contributor Author)

@averikitsch This is an (opinionated) update from the nodejs-docs-samples version: if the retry script fails entirely, it should return an error status, so that the Cloud Build step fails, then the job, and therefore the test.
This lets setup failures fail early, rather than surfacing as later steps failing because of the earlier failure.

Contributor

I'm not sure I follow. The exit 1 should cause the script and then the build step to fail.

@glasnt (Contributor Author)

Even with this change, the step will only fail if its last exit status is 1. So in theory, if you use multiple retry.sh calls in one step, an earlier call can fail and the step can still continue, depending on the order and success of the commands.

Contributor

Good catch. Would this have to be handled in the cloud build to exit sooner?

@glasnt (Contributor Author)

In the example I borrowed from, the cleanup was all done in one step. If and only if the last command failed in the step would the step fail.
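
The behaviour described in this exchange, where a multi-command step reports only the exit status of its last command, can be reproduced with plain shell. The sketch below uses `false` and `true` as hypothetical stand-ins for two retry.sh invocations in one step, and shows `set -e` as one possible way to stop at the first failure. Whether the build steps in this PR use `set -e` is not stated in the thread, so treat it as an assumption.

```shell
# Hypothetical stand-ins for two retry.sh calls inside a single build step.
first_fails='false'
second_succeeds='true'

# Without errexit, the script body's status is that of its last command,
# so the earlier failure is masked and the step appears to succeed.
run_without_errexit() {
  sh -c "$first_fails; $second_succeeds"
}

# With `set -e`, the shell exits at the first failing command,
# so the step fails early with that command's status.
run_with_errexit() {
  sh -c "set -e; $first_fails; $second_succeeds"
}
```

Here `run_without_errexit` returns success even though its first command failed, while `run_with_errexit` returns a non-zero status.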

@glasnt (Contributor Author)

FYI I backed out these changes and updated my use of the script to correct the issues I was experiencing.

@leahecole merged commit aae176a into GoogleCloudPlatform:master Feb 11, 2021
@glasnt deleted the topic/django-e2e-wip branch February 11, 2021 21:31
Labels: "cla: yes", "samples"
4 participants