Skip to content

[7.x] [ML] adding ml autoscaling integration test (#65638) #65775

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 3, 2020

Conversation

benwtrent
Copy link
Member

Backports the following commits to 7.x:

This adds ml autoscaling integration tests.

The test verifies that the scaling requirements adjust according to the current real load
on the cluster given machine learning jobs of various sizes.

Additionally, there was a bug in the ml scaling service settings. This commit addresses the bug.
@benwtrent benwtrent added :ml Machine learning backport labels Dec 2, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent benwtrent merged commit 4d8bba3 into elastic:7.x Dec 3, 2020
@benwtrent benwtrent deleted the backport/7.x/pr-65638 branch December 3, 2020 14:20
Settings.builder().put(MlAutoscalingDeciderService.DOWN_SCALE_DELAY.getKey(), TimeValue.ZERO).build());
final PutAutoscalingPolicyAction.Request request = new PutAutoscalingPolicyAction.Request(
"ml_test",
new TreeSet<>(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benwtrent do you know why this did not work? It is preventing me from getting #66082 merged.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henningandersen when I tried to supply "ml" it complained saying it was not really a valid node role. Now, this may have been due to the ML plugin not being loaded correctly? The easiest way to check is to simply put ml in there in 7.x and ry to run the test. I will double check

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I did that and get the validation error. I only wanted to be sure I was not chasing something you already investigated deeply.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henningandersen

 REPRODUCE WITH: ./gradlew ':x-pack:plugin:ml:qa:native-multi-node-tests:javaRestTest' --tests "org.elasticsearch.xpack.ml.integration.AutoscalingIT.testMLAutoscalingCapacity" -Dtests.seed=93A328B876AF5108 -Dtests.security.manager=true -Dtests.locale=pl -Dtests.timezone=Atlantic/Faroe -Druntime.java=11
  2> org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: ml;
        at __randomizedtesting.SeedInfo.seed([93A328B876AF5108:D856B3907EB37506]:0)
        at org.elasticsearch.xpack.autoscaling.action.PutAutoscalingPolicyAction$Request.validate(PutAutoscalingPolicyAction.java:150)
        at org.elasticsearch.action.TransportActionNodeProxy.execute(TransportActionNodeProxy.java:42)
        at org.elasticsearch.client.transport.TransportProxyClient.lambda$execute$0(TransportProxyClient.java:55)
        at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:253)
        at org.elasticsearch.client.transport.TransportProxyClient.execute(TransportProxyClient.java:55)
        at org.elasticsearch.client.transport.TransportClient.doExecute(TransportClient.java:391)
        at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:412)
        at org.elasticsearch.client.FilterClient.doExecute(FilterClient.java:65)
        at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:412)
        at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:401)
        at org.elasticsearch.xpack.ml.integration.AutoscalingIT.testMLAutoscalingCapacity(AutoscalingIT.java:57)

I THINK the node role format has changed in master vs 7.x. Passing ml as the only role works fine in master.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhhh, the validation runs in the transport client and this causes the issue, since the client does not know about the ml plugin. Using a core role like master passes that validation (but fails the test with my PR due to validation).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me open a PR to move the validation out of the request in 7.x.

henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Dec 9, 2020
While transport client is not supported for autoscaling in 7.x, some
tests rely on it and this commit ensures that the validation of roles
happen server side and not client side.

Relates elastic#65775 and elastic#66082
henningandersen added a commit that referenced this pull request Dec 9, 2020
While transport client is not supported for autoscaling in 7.x, some
tests rely on it and this commit ensures that the validation of roles
happen server side and not client side.

Relates #65775 and #66082
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport :ml Machine learning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants