Wait for pending ml tasks in docs tests #44123

Merged
merged 4 commits into from
Jul 15, 2019
2 changes: 1 addition & 1 deletion docs/reference/data-frames/apis/put-transform.asciidoc
@@ -110,7 +110,7 @@ PUT _data_frame/transforms/ecommerce_transform
}
--------------------------------------------------
// CONSOLE
-// TEST[skip: https://github.com/elastic/elasticsearch/issues/43271]
+// TEST[setup:kibana_sample_data_ecommerce]

When the transform is created, you receive the following results:
[source,js]
1 change: 0 additions & 1 deletion docs/reference/ml/apis/put-job.asciidoc
@@ -116,7 +116,6 @@ PUT _ml/anomaly_detectors/total-requests
}
--------------------------------------------------
// CONSOLE
-// TEST[skip: https://github.com/elastic/elasticsearch/issues/43271]

When the job is created, you receive the following results:
[source,js]
@@ -33,6 +33,7 @@
import org.elasticsearch.common.xcontent.XContentLocation;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.common.xcontent.XContentParser.Token;
import org.elasticsearch.test.rest.ESRestTestCase;
import org.elasticsearch.test.rest.yaml.ClientYamlDocsTestClient;
import org.elasticsearch.test.rest.yaml.ClientYamlTestCandidate;
import org.elasticsearch.test.rest.yaml.ClientYamlTestClient;
@@ -41,6 +42,7 @@
import org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase;
import org.elasticsearch.test.rest.yaml.restspec.ClientYamlSuiteRestSpec;
import org.elasticsearch.test.rest.yaml.section.ExecutableSection;
import org.junit.After;

import java.io.IOException;
import java.util.ArrayList;
@@ -97,6 +99,23 @@ protected ClientYamlTestClient initClientYamlTestClient(
        return new ClientYamlDocsTestClient(restSpec, restClient, hosts, esVersion, masterVersion, this::getClientBuilderWithSniffedHosts);
    }

    @After
    public void cleanup() throws Exception {
        if (isMachineLearningTest() || isDataFrameTest()) {
            ESRestTestCase.waitForPendingTasks(adminClient());
Member

I wonder how bad it'd be to do this after every test. I don't feel great about relying on stuff in the test name. It just feels a bit too magical.

Member Author

It is a little complicated because Rollups do the wait in the base ESRestTestCase.

Additionally, some tests leave tasks running. get-follow-info.asciidoc line 38 is a good example: it creates various CCR tasks, which will be waited on indefinitely unless the test teardown is run. Interestingly what appears to be happening is the @After method of this class is called before the test teardown.

Member

> Interestingly what appears to be happening is the @After method of this class is called before the test teardown

Weird!

I'm not a big fan of leaving things running in those tests either. Is there a way you could do something like what rollups do here? It looks like it only cares about rollup-style jobs. Does ml have something similar?

Member Author

Yeah, rollups filter the waiting tasks with taskName.startsWith("xpack/rollup/job"), and we can do something similar with ml jobs, but the action causing the leakage in #43271 is indexing a document, not an ml task. Waiting for all tasks catches unexpected issues and actually helps with debugging tests that have failed due to leakage from a previous test. Experience from using this in XPackRestIT has shown that it is very valuable.

If I remove the if (isMachineLearningTest() || isDataFrameTest()) { check, then the tests that fail with pending tasks are ccr and rollup. I'll look into what's happening there, and maybe there is a way of removing the "if ml ..." conditional.
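
The difference between the two waiting strategies can be sketched in isolation. The class and method below are hypothetical and are not part of ESRestTestCase; only the `xpack/rollup/job` prefix comes from the comment above, and the task names in `main` are illustrative. The sketch shows why a prefix filter would not notice a leaked indexing task, while an unfiltered wait would.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; it does not talk to a cluster.
final class PendingTaskFilterSketch {

    // Returns the task names that a wait loop using this filter would block on.
    static List<String> tasksToWaitFor(List<String> runningTasks, Predicate<String> filter) {
        return runningTasks.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative task names: one rollup job task and one leaked indexing task.
        List<String> running = List.of("xpack/rollup/job[example]", "indices:data/write/bulk");

        // Rollup-style prefix filter: the indexing task is ignored.
        System.out.println(tasksToWaitFor(running, name -> name.startsWith("xpack/rollup/job")));

        // Unfiltered wait: every running task is reported, including the leaked one.
        System.out.println(tasksToWaitFor(running, name -> true));
    }
}
```

A prefix filter is cheaper to reason about, but it only protects against the task types someone thought to list.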

Member Author

I took a look at the Rollup and CCR tests. Unfortunately, it is not possible to wait for pending tasks after every test because those tests require special handling. I cannot see a way to simplify the logic, and I think the current code is best as it is, since it is explicitly for the ml & data frame tests.

Also, as more xpack feature snippet testing is added, I would expect more usages of the pattern, e.g. if (isSecurityTest()) { // security specific cleanup

Using the test name to determine whether the test is an ml test is a valid use. XPackRestIT set the precedent some time ago and it has not caused problems there.

Member

I'm really not a fan of looking at the test name. I know XPackRestIT does it, and I think it is sneaky black magic that will cause tests to fail in ways that are very difficult to trace. One badly named test invoking ml will cause subsequent tests to fail. Sometimes. Randomly.

I'm ok with merging this, but I'd really like a follow-up issue to remove it somehow, because I'm 100% sure somebody is going to lose many hours debugging errors caused by a funnily named test one day.
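
The concern can be made concrete with a standalone copy of the name checks from this change. In the sketch below the predicates take the test name as a parameter so they run without the test framework; the class name and all test names are hypothetical. It shows that the check follows the snippet's path, not what the snippet actually calls.

```java
// Standalone copy of the name-based checks, for illustration only.
final class TestNameCheckSketch {

    static boolean isMachineLearningTest(String testName) {
        return testName != null && (testName.contains("/ml/") || testName.contains("\\ml\\"));
    }

    static boolean isDataFrameTest(String testName) {
        return testName != null && (testName.contains("/data-frames/") || testName.contains("\\data-frames\\"));
    }

    public static void main(String[] args) {
        // Hypothetical test names; real names depend on how the docs snippets are laid out.
        System.out.println(isMachineLearningTest("reference/ml/apis/put-job"));              // true
        System.out.println(isDataFrameTest("reference/data-frames/apis/put-transform"));     // true

        // A snippet outside the ml directory is not detected, even if it were to call ml APIs,
        // so no wait for pending tasks would run after it.
        System.out.println(isMachineLearningTest("reference/aggregations/uses-ml"));         // false

        // An unrelated snippet under a directory that happens to be named "ml" is detected.
        System.out.println(isMachineLearningTest("reference/other/ml/notes"));               // true
    }
}
```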

Member

Can you detect a data frame test or ML test by looking at the public API somehow? For example, by looking for jobs or something similar.

        }
    }

    protected boolean isMachineLearningTest() {
        String testName = getTestName();
        return testName != null && (testName.contains("/ml/") || testName.contains("\\ml\\"));
    }

    protected boolean isDataFrameTest() {
        String testName = getTestName();
        return testName != null && (testName.contains("/data-frames/") || testName.contains("\\data-frames\\"));
    }

    /**
     * Compares the results of running two analyzers against many random
     * strings. The goal is to figure out if two analyzers are "the same" by