Wait for ML templates after creating a new cluster in TooManyJobsIT #54793

Merged: davidkyle merged 1 commit into elastic:master from really-wait-for-templates on Apr 6, 2020

davidkyle
Member

TooManyJobsIT.testCloseFailedJob stops all the data nodes in the cluster and then restarts them. In effect this is a brand new cluster, so the test should wait for the ML templates to be installed before continuing.
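The idea of waiting for the templates before continuing can be sketched as a polling loop. This is a minimal illustration, not the code in this PR: the class `WaitForTemplatesSketch`, the helper `waitUntil`, and the template names are all hypothetical, and a background thread stands in for the cluster installing its templates. In the Elasticsearch test framework, `assertBusy` plays a similar role.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: not code from this PR or from the Elasticsearch test framework.
class WaitForTemplatesSketch {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder template names, chosen only for illustration.
        Set<String> required = Set.of("template-a", "template-b");
        Set<String> installed = ConcurrentHashMap.newKeySet();

        // Simulates a freshly restarted cluster that installs its templates a little later.
        Thread installer = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            installed.addAll(required);
        });
        installer.start();

        // The test body would only continue once every required template is present.
        boolean ready = waitUntil(() -> installed.containsAll(required), 5_000, 10);
        installer.join();
        System.out.println("templates installed: " + ready);
    }
}
```

Returning a boolean rather than throwing keeps the sketch small; a real test would more likely fail with an assertion error on timeout.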

This handles 2 of the failures listed in #54162, where indices in the new cluster are created without a template. There is still one failure where the test timed out waiting for the templates in the original cluster. This may just be due to a slow machine; I will unmute the test in the backport to 7.x and watch for new failures.

As to why the tests started failing now, and only on 7.x, I am not sure; possibly something in #51765. The best way forward is to unmute the test and see if it happens again.

@davidkyle davidkyle added >test Issues or PRs that are addressing/adding tests :ml Machine learning v8.0.0 v7.8.0 labels Apr 6, 2020
@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

Contributor

@droberts195 droberts195 left a comment

LGTM

There's scope for changing the way we start these clusters more radically. The comment "// clear all nodes, so that we can set xpack.ml.max_open_jobs setting" around line 202/203 is wrong now; it dates back to 5.x. In 7.x xpack.ml.max_open_jobs is a dynamic cluster-wide setting, so we could change the setting with a single settings call rather than restarting the entire cluster. But I am happy to leave that for a future PR.
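As a rough illustration of what "a single settings call" could look like, the sketch below builds (but does not send) a request to the cluster update settings REST endpoint, `PUT /_cluster/settings`. It is an assumption-laden example: the class `MaxOpenJobsSettingSketch`, the base URL `http://localhost:9200`, and the value `5` are made up for illustration, and it uses the JDK HTTP client rather than the internal test cluster's Java client.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: builds the request only, nothing is sent to a cluster.
class MaxOpenJobsSettingSketch {

    /** JSON body that sets xpack.ml.max_open_jobs as a transient cluster setting. */
    static String settingsBody(int maxOpenJobs) {
        return "{\"transient\":{\"xpack.ml.max_open_jobs\":" + maxOpenJobs + "}}";
    }

    /** Builds a PUT request against the cluster update settings endpoint. */
    static HttpRequest buildRequest(String baseUrl, int maxOpenJobs) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/_cluster/settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(maxOpenJobs)))
                .build();
    }

    public static void main(String[] args) {
        // Assumed local address and an arbitrary limit, for illustration only.
        HttpRequest request = buildRequest("http://localhost:9200", 5);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(settingsBody(5));
    }
}
```

Whether the integration test can issue an equivalent call through its internal cluster client is left open here.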

@davidkyle
Member Author

> xpack.ml.max_open_jobs is a dynamic cluster-wide setting

Yes, I looked at that, but I can't see how to change a setting on the internal cluster. There probably is a way; I just couldn't see it.

@davidkyle davidkyle merged commit a92c1dd into elastic:master Apr 6, 2020
@davidkyle davidkyle deleted the really-wait-for-templates branch April 6, 2020 11:48
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Apr 6, 2020
Labels
:ml Machine learning >test Issues or PRs that are addressing/adding tests v7.8.0 v8.0.0-alpha1
4 participants