[7.3][ML] Outlier detection should only fetch docs that have the analyzed … #44960


dimitris-athanasiou
Contributor

…… (#44944)

Data frame rows with missing values for analyzed fields are skipped,
so we can be more efficient by including a query that only picks documents
that have values for all analyzed fields. Besides reducing the number
of documents we go through, this also gives a more accurate estimate
of how many rows we need, which reduces the memory requirements.

This also adds an integration test that runs outlier detection on data
with missing fields.
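
For illustration only, here is a rough sketch of the kind of query described above, written as a plain Python dict in Elasticsearch query DSL form. It is not the code from this PR: the helper name `build_analyzed_fields_query`, the example field names, and the way an optional source query is combined are all assumptions. It only relies on the standard `bool` and `exists` queries.

```python
import json


def build_analyzed_fields_query(analyzed_fields, base_query=None):
    """Build a query matching only docs that have a value for every analyzed field.

    Hypothetical helper for illustration; not taken from the PR itself.
    `base_query` stands in for whatever query the job's source config defines.
    """
    if not analyzed_fields:
        raise ValueError("at least one analyzed field is required")

    # One `exists` clause per analyzed field. Placing them in `filter`
    # means all of them must match and none of them affects scoring.
    exists_clauses = [{"exists": {"field": name}} for name in analyzed_fields]

    bool_query = {"filter": exists_clauses}
    if base_query is not None:
        # Keep the original query and narrow it with the exists filters.
        bool_query["must"] = [base_query]
    return {"bool": bool_query}


if __name__ == "__main__":
    # Example field names are made up.
    query = build_analyzed_fields_query(["cpu_usage", "memory_usage"])
    print(json.dumps(query, indent=2))
```

Running the same query through a count request before extraction would presumably give the tighter row estimate mentioned above, since documents missing any analyzed field no longer contribute to the total.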

@elasticmachine
Collaborator

Pinging @elastic/ml-core

@dimitris-athanasiou
Contributor Author

@elasticmachine update branch

@dimitris-athanasiou dimitris-athanasiou merged commit cb25cf8 into elastic:7.3 Aug 1, 2019
@dimitris-athanasiou dimitris-athanasiou deleted the outlier-detection-should-query-docs-that-have-all-analyzed-fields-7_3 branch August 1, 2019 12:06

2 participants