Skip to content

[ML][Data Frame] responding with 409 status code when failing _stop #44231

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 12, 2019

Conversation

benwtrent
Copy link
Member

When attempting a _stop call against any number of tasks with force=false and any of the tasks are failed, we should not stop any of the tasks. The response code should indicate an error with details in the body.

closes #44103

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Left a few comments I don't think you need the integration test this is mergeable without it

@@ -108,6 +134,8 @@ protected void taskOperation(StopDataFrameTransformAction.Request request, DataF
}

if (ids.contains(transformTask.getTransformId())) {
// This should not occur as we validate that none of the tasks are in a failed state earlier
// Keep this check in here for insurance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -88,8 +91,31 @@ protected void doExecute(Task task, StopDataFrameTransformAction.Request request
new PageParams(0, 10_000),
request.isAllowNoMatch(),
ActionListener.wrap(hitsAndIds -> {
if (request.isForce() == false) {
PersistentTasksCustomMetaData tasks = state.metaData().custom(PersistentTasksCustomMetaData.TYPE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is the main change. If you really want to test this code extract it to a static method and it is easily unit testable. MlTasksTests is a good reference for building PersistentTasksCustomMetaData

@benwtrent
Copy link
Member Author

run elasticsearch-ci/bwc
run elasticsearch-ci/default-distro

@benwtrent
Copy link
Member Author

@elasticmachine update branch

@@ -161,7 +161,8 @@ protected void createContinuousPivotReviewsTransform(String transformId, String

String config = "{ \"dest\": {\"index\":\"" + dataFrameIndex + "\"},"
+ " \"source\": {\"index\":\"" + REVIEWS_INDEX_NAME + "\"},"
+ " \"sync\": {\"time\":{\"field\": \"timestamp\", \"delay\": \"15m\"}},"
//Set frequency high for testing

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Set frequency high" - DYM low? The default is 60s, so 1s is lower?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High frequency => low time between repeats

(Similar to high frequency => short wavelength)

But if it's confusing then maybe bypass this by changing the comment to something like // Set frequency to 1s to make the test run as fast as possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of those pedantic things that annoy me; frequency is measured in Hz not seconds, this is an interval. However, most people will read this and understand that frequency represents the period here and is probably the least confusing naming option.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A high frequency in Hz means a short distance between cycles.

A high frequency here means a short time between attempts.

failedTasks.get(0),
failedReasons.get(0)) :
"Unable to stop data frame transforms. The following transforms are in a failed state " +
failedTasks + " with reasons " + failedReasons + ". Use force stop to stop the data frame transforms.";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use a constant

@@ -88,8 +120,9 @@ protected void doExecute(Task task, StopDataFrameTransformAction.Request request
new PageParams(0, 10_000),
request.isAllowNoMatch(),
ActionListener.wrap(hitsAndIds -> {
validateTaskState(state, hitsAndIds.v2(), request.isForce());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: taskstate -> taskstates (or is it tasksstates?)

Copy link

@hendrikmuhs hendrikmuhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some very minor comments

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@benwtrent benwtrent merged commit 0b6676a into elastic:master Jul 12, 2019
@benwtrent benwtrent deleted the bug/ml-df-handle-failed-stops branch July 12, 2019 12:57
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Jul 12, 2019
…lastic#44231)

* [ML][Data Frame] responding with appropriate status code when failing _stop

* adding null checks for persistent task data

* addressing PR comments
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Jul 12, 2019
…lastic#44231)

* [ML][Data Frame] responding with appropriate status code when failing _stop

* adding null checks for persistent task data

* addressing PR comments
benwtrent added a commit that referenced this pull request Jul 12, 2019
…44231) (#44277)

* [ML][Data Frame] responding with appropriate status code when failing _stop

* adding null checks for persistent task data

* addressing PR comments
benwtrent added a commit that referenced this pull request Jul 12, 2019
…44231) (#44276)

* [ML][Data Frame] responding with appropriate status code when failing _stop

* adding null checks for persistent task data

* addressing PR comments
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ML] Incorrect response when trying to stop a failed data frame transform without force=true
6 participants