Increase Azure client timeout on tests #67210
Instead of executing all delete requests in parallel, this commit executes them in batches of 100 parallel deletions. The goal is to avoid timeout failures when a large number of files must be deleted: if all deletions run in parallel, a few slow requests can cause the rest to time out, since the connection pool imposes an effective limit on concurrency. Additionally, this commit improves the error messages, which previously did not include the blob name on deletion failures. Closes elastic#67119
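For illustration, the batching idea described above could be sketched roughly as follows. This is not the code from this PR: `BatchedDeleter`, `partition`, `deleteAll`, and the `deleteOne` callback are hypothetical names, and the error message format is an assumption. Only the batch size of 100 and the idea of naming the blob in the failure message come from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical illustration only; not the implementation from this PR.
final class BatchedDeleter {
    // Batch size quoted in the PR description.
    static final int BATCH_SIZE = 100;

    /** Splits names into consecutive sublists of at most batchSize elements. */
    static List<List<String>> partition(List<String> names, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < names.size(); i += batchSize) {
            batches.add(names.subList(i, Math.min(i + batchSize, names.size())));
        }
        return batches;
    }

    /**
     * Deletes blobs batch by batch. Deletes within a batch run concurrently;
     * the next batch starts only after the current one has finished.
     * Returns one message per failed delete, each naming the blob.
     */
    static List<String> deleteAll(List<String> blobNames,
                                  Function<String, CompletableFuture<Void>> deleteOne) {
        List<String> failures = new ArrayList<>();
        for (List<String> batch : partition(blobNames, BATCH_SIZE)) {
            // Start every delete in this batch before waiting on any of them.
            List<CompletableFuture<Void>> inFlight = new ArrayList<>();
            for (String name : batch) {
                inFlight.add(deleteOne.apply(name));
            }
            // Wait for the whole batch and record failures with the blob name.
            for (int i = 0; i < batch.size(); i++) {
                try {
                    inFlight.get(i).join();
                } catch (RuntimeException e) {
                    // join() wraps the original failure; unwrap it when present.
                    Throwable cause = e.getCause() != null ? e.getCause() : e;
                    failures.add("Unable to delete blob [" + batch.get(i) + "]: " + cause.getMessage());
                }
            }
        }
        return failures;
    }
}
```

With this shape, at most 100 requests compete for connections at any time, at the cost of less overall parallelism than issuing every delete at once.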
Pinging @elastic/es-distributed (Team:Distributed)
@original-brownbear would you mind taking a look at this when you have the time? Thanks!
Thanks Francisco, this makes sense, I think.
My understanding is that the problem we're facing is a client side request timeout because the requests have to wait for so long to actually go out over the limited connections we have.
I wonder if we could just increase the timeout substantially to work around this instead?
That would keep the code simpler, for one, and would probably also run faster overall, since deleting everything at once gives us more parallelism.
The risk of occasionally failing a bulk delete because of a bunch of slow-running deletes isn't so bad IMO; all our repo operations clean up the leftovers during subsequent delete operations anyway. In the real world it seems very unlikely we'd see what failed the test here, since the test timeouts are so absurdly short.
Also, in practice we don't even have a timeout by default for Azure, do we? So this seems to be a test-only issue for the most part.
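To make the queueing argument concrete, here is a rough back-of-the-envelope model. It is only a sketch under simplifying assumptions that are not from the PR: every request takes the same time, each connection serves one request at a time, the timeout clock starts when the request is submitted, and a request finishing exactly at the deadline counts as done. `QueueTimeoutModel` and all the numbers used with it are hypothetical.

```java
// Hypothetical model of requests queueing behind a limited connection pool.
final class QueueTimeoutModel {
    /**
     * Number of requests that finish before the timeout when all of them
     * are submitted at once and share a fixed number of connections.
     */
    static long completedBeforeTimeout(long requests, int connections,
                                       long requestMillis, long timeoutMillis) {
        // Full request "rounds" each connection can serve before the deadline.
        long rounds = timeoutMillis / requestMillis;
        return Math.min(requests, rounds * connections);
    }
}
```

For example, with 1000 deletes, 50 connections, 100 ms per request, and a 500 ms timeout, the model lets only 250 requests finish in time; raising the timeout to 10 s lets all 1000 finish, without any batching.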
Thanks for the review Armin!
Yes, that should solve the issue. I was a bit worried about the consequences of a bunch of failed deletes. I'll increase the timeout and keep the improvements around the error messages.
LGTM, thanks Francisco!
Additionally, this commit improves the error messages provided as previously we weren't including the blob name on deletion failures. Closes elastic#67119 Backport of elastic#67210
Additionally, this commit improves the error messages provided as
previously we weren't including the blob name on deletion failures.
Closes #67119