[7.x] [ML] adding ml autoscaling integration test (#65638) #65775
This adds ml autoscaling integration tests. The test verifies that the scaling requirements adjust according to the current real load on the cluster given machine learning jobs of various sizes. Additionally, there was a bug in the ml scaling service settings. This commit addresses the bug.
Pinging @elastic/ml-core (:ml)
Settings.builder().put(MlAutoscalingDeciderService.DOWN_SCALE_DELAY.getKey(), TimeValue.ZERO).build());
final PutAutoscalingPolicyAction.Request request = new PutAutoscalingPolicyAction.Request(
    "ml_test",
    new TreeSet<>(),
@benwtrent do you know why this did not work? It is preventing me from getting #66082 merged.
@henningandersen when I tried to supply `"ml"`, it complained that it was not a valid node role. This may have been due to the ML plugin not being loaded correctly. The easiest way to check is to put `ml` in there in 7.x and try to run the test. I will double-check.
Yeah, I did that and got the validation error. I only wanted to be sure I was not chasing something you had already investigated deeply.
REPRODUCE WITH: ./gradlew ':x-pack:plugin:ml:qa:native-multi-node-tests:javaRestTest' --tests "org.elasticsearch.xpack.ml.integration.AutoscalingIT.testMLAutoscalingCapacity" -Dtests.seed=93A328B876AF5108 -Dtests.security.manager=true -Dtests.locale=pl -Dtests.timezone=Atlantic/Faroe -Druntime.java=11
2> org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: ml;
at __randomizedtesting.SeedInfo.seed([93A328B876AF5108:D856B3907EB37506]:0)
at org.elasticsearch.xpack.autoscaling.action.PutAutoscalingPolicyAction$Request.validate(PutAutoscalingPolicyAction.java:150)
at org.elasticsearch.action.TransportActionNodeProxy.execute(TransportActionNodeProxy.java:42)
at org.elasticsearch.client.transport.TransportProxyClient.lambda$execute$0(TransportProxyClient.java:55)
at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:253)
at org.elasticsearch.client.transport.TransportProxyClient.execute(TransportProxyClient.java:55)
at org.elasticsearch.client.transport.TransportClient.doExecute(TransportClient.java:391)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:412)
at org.elasticsearch.client.FilterClient.doExecute(FilterClient.java:65)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:412)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:401)
at org.elasticsearch.xpack.ml.integration.AutoscalingIT.testMLAutoscalingCapacity(AutoscalingIT.java:57)
I THINK the node role format has changed in master vs 7.x. Passing `ml` as the only role works fine in master.
Ahhhh, the validation runs in the transport client and this causes the issue, since the client does not know about the ml plugin. Using a core role like `master` passes that validation (but fails the test with my PR due to validation).
Let me open a PR to move the validation out of the request in 7.x.
While the transport client is not supported for autoscaling in 7.x, some tests rely on it, and this commit ensures that the validation of roles happens server side and not client side. Relates elastic#65775 and elastic#66082
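To illustrate the failure mode discussed above, here is a minimal, self-contained sketch. It is not the Elasticsearch implementation: the class `RoleValidationSketch`, the method `unknownRoles`, and both role sets are hypothetical stand-ins. It only assumes, as described in the thread, that the process running the validation checks requested roles against the roles it knows about, and that a client without the ML plugin does not know `ml`.

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class RoleValidationSketch {
    // Hypothetical: roles known to a client process that has no ML plugin loaded.
    static final Set<String> CLIENT_KNOWN_ROLES = Set.of("master", "data", "ingest");

    // Hypothetical: roles known to a server node that has the ML plugin loaded.
    static final Set<String> SERVER_KNOWN_ROLES = Set.of("master", "data", "ingest", "ml");

    /** Returns the requested roles that the given process does not recognise. */
    static SortedSet<String> unknownRoles(Set<String> requestedRoles, Set<String> knownRoles) {
        SortedSet<String> unknown = new TreeSet<>(requestedRoles);
        unknown.removeAll(knownRoles);
        return unknown;
    }

    public static void main(String[] args) {
        Set<String> requested = Set.of("ml");

        // Validating on the client rejects "ml" because the client has never heard of it.
        System.out.println("client-side unknown roles: " + unknownRoles(requested, CLIENT_KNOWN_ROLES));

        // The same check on the server passes because the plugin's role is known there.
        System.out.println("server-side unknown roles: " + unknownRoles(requested, SERVER_KNOWN_ROLES));
    }
}
```

Under these assumptions, the same request is rejected or accepted depending only on where the check runs, which is why moving the role validation to the server side lets tests that go through the transport client pass `ml`.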
Backports the following commits to 7.x: