Ensure Test Cluster in Azure Discovery Tests Closes #43057

Merged
merged 7 commits into from
Jun 12, 2019

original-brownbear
Member

@original-brownbear original-brownbear commented Jun 10, 2019

  • Use internalCluster().close() to force all nodes (not just the data nodes) to shut down, even if one of them fails to shut down in time
  • Allow more than 10s for node shutdown; on a slow run, 10s might not suffice in an IT
  • Force-close httpServer to get cleaner logs if nodes still hang on shutdown
  • Relates [CI] AzureDiscoveryClusterFormationTests leaks threads #43048

* Use `internalCluster().close()` to force all nodes (and not just the datanodes) to shut down even if one fails to shut down in time
* More than 10s timeout on node shutdown, if we have a slow run 10s might not suffice in an IT
* Force closing httpServer to get cleaner logs if nodes still hang on shut down
* Closes elastic#43048
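
The teardown idea in the bullets above can be sketched in plain Java. This is a minimal illustration, not the code from this PR: `TeardownSketch`, `closeAll`, and `tearDown` are hypothetical names, the nodes are modelled as generic `AutoCloseable` resources rather than Elasticsearch nodes, and the web server is assumed to be a JDK `com.sun.net.httpserver.HttpServer`.

```java
import com.sun.net.httpserver.HttpServer;
import java.util.List;

public class TeardownSketch {

    /**
     * Closes every resource in order, even if an earlier one throws.
     * The first failure is rethrown at the end; later failures are attached as suppressed.
     */
    static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
        Exception first = null;
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    /**
     * Hypothetical test teardown: shut down all nodes, then stop the HTTP server
     * regardless of whether node shutdown succeeded.
     */
    static void tearDown(List<? extends AutoCloseable> nodes, HttpServer server) throws Exception {
        try {
            closeAll(nodes);
        } finally {
            // Delay of 0 seconds: do not wait for in-flight exchanges to finish.
            server.stop(0);
        }
    }
}
```

Putting the server stop in a `finally` block means a hanging or failing node cannot leave the server's threads running, which is what keeps the resulting logs and thread dumps focused on the node that is actually stuck.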
@original-brownbear original-brownbear added >test Issues or PRs that are addressing/adding tests :Distributed Coordination/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure v8.0.0 v7.3.0 labels Jun 10, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

Contributor

@henningandersen henningandersen left a comment

LGTM.

@original-brownbear
Member Author

Jenkins run elasticsearch-ci/2

@original-brownbear
Member Author

original-brownbear commented Jun 11, 2019

It appears the increased timeout somehow breaks an ML test. I'm looking into that.

Update: yes, the ML tests do in fact depend on node shutdown hitting the interrupt mentioned below within 10s, so I can't just raise the timeout here without breaking those tests.

@original-brownbear
Member Author

@henningandersen I looked into the code some more, and I'm no longer convinced that raising the timeout here will help: closing a node waits 10s, then interrupts the node's threadpool and waits another 10s, so we shouldn't really hang here. I'd still merge this, if that's OK with you, just to get a clean stack trace showing exactly where things are stuck (currently we neither stop all the nodes nor stop the web server on failure, so it's hard to tell).
WDYT?
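
For readers unfamiliar with the wait, interrupt, wait-again shutdown described in the comment above, here is a minimal sketch of that general pattern using a plain `java.util.concurrent.ExecutorService`. It is not Elasticsearch's node-close code; `TwoPhaseShutdown` and `terminate` are hypothetical names, and the timeout is a parameter rather than a fixed 10s.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

public class TwoPhaseShutdown {

    /**
     * Phase 1: stop accepting new tasks and wait up to the timeout for running tasks to finish.
     * Phase 2: if tasks are still running, interrupt them and wait up to the timeout again.
     *
     * @return true if the pool terminated within the two waits, false if it is still running
     */
    static boolean terminate(ExecutorService pool, long timeout, TimeUnit unit)
            throws InterruptedException {
        pool.shutdown();
        if (pool.awaitTermination(timeout, unit)) {
            return true;
        }
        // Interrupts worker threads; tasks that ignore interruption can still hang.
        pool.shutdownNow();
        return pool.awaitTermination(timeout, unit);
    }
}
```

With this shape, a task that responds to interruption is bounded by roughly twice the timeout, which matches the reasoning that a longer first wait would not by itself fix a hang; only tasks that ignore the interrupt can outlive both waits.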

Contributor

@ywelsch ywelsch left a comment

LGTM. Let's get this merged

Contributor

@henningandersen henningandersen left a comment

LGTM.

@original-brownbear original-brownbear merged commit 38d1caf into elastic:master Jun 12, 2019
@original-brownbear original-brownbear deleted the 43048 branch June 12, 2019 08:43
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Jun 19, 2019
* Use `internalCluster().close()` to force all nodes (and not just the datanodes) to shut down even if one fails to shut down in time
* Force closing httpServer to get cleaner logs if nodes still hang on shut down
* Relates elastic#43048
original-brownbear added a commit that referenced this pull request Jun 19, 2019
* Use `internalCluster().close()` to force all nodes (and not just the datanodes) to shut down even if one fails to shut down in time
* Force closing httpServer to get cleaner logs if nodes still hang on shut down
* Relates #43048
Labels
:Distributed Coordination/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure >test Issues or PRs that are addressing/adding tests v7.3.0 v8.0.0-alpha1