[ML] Switch data frame analytics memory estimate from KB to MB #1110

Closed
droberts195 opened this issue Mar 31, 2020 · 1 comment · Fixed by #1126

@droberts195
Contributor

For anomaly detection memory limits/estimates are done in terms of whole megabytes.

For data frame analytics the estimates are done in kilobytes and limits can be as granular as single bytes.

Originally estimates were going to be in tenths of megabytes, because the feeling was that many analyses would use less than a megabyte and rounding up would be too wasteful. Since tenths of megabytes do not round nicely to byte values, we decided to go with kilobytes instead.

However, as #1106 showed, even quite simple analyses are using ~10 MB, so rounding up to the next megabyte would not be a major problem.

Rounding estimates to the nearest megabyte would avoid the excessive precision problem noted in elastic/elasticsearch#54506. It would also improve consistency with anomaly detection.

We would still have to accept limits set in units other than megabytes, as jobs that have such limits will already exist. However, we could nudge future jobs towards using whole numbers of megabytes for their memory limits by always returning estimates as whole numbers of megabytes.
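To illustrate the proposed rounding, here is a minimal C++ sketch (not the actual ml-cpp code; the function name and constant are illustrative) that rounds a low level byte estimate up to the next whole megabyte:

```cpp
#include <cstdint>
#include <iostream>

// Round a raw byte estimate up to the next whole megabyte (1 MB = 1024 * 1024 bytes).
std::uint64_t roundUpToWholeMegabytes(std::uint64_t estimateBytes) {
    constexpr std::uint64_t BYTES_PER_MB = 1024ULL * 1024ULL;
    return (estimateBytes + BYTES_PER_MB - 1) / BYTES_PER_MB;
}

int main() {
    // A ~10 MB low level estimate, as seen in #1106, rounds up to 11 whole megabytes.
    std::cout << roundUpToWholeMegabytes(10'500'000) << "mb\n";
    return 0;
}
```

With reporting always in whole megabytes, estimates would line up with the units anomaly detection already uses for its memory limits.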

/cc @peteharverson

@droberts195
Contributor Author

We discussed this and agreed to switch to MB. I will open a PR to do this.

droberts195 added a commit to droberts195/ml-cpp that referenced this issue Apr 6, 2020
Previously data frame analytics memory estimates were
rounded to the nearest kilobyte, but this resulted in
excessive precision for large analyses. This changes
the estimates to always be reported in whole megabytes,
rounded up from the low level estimate.

Closes elastic#1110
Closes elastic/elasticsearch#54506
droberts195 added a commit that referenced this issue Apr 7, 2020
Previously data frame analytics memory estimates were
rounded to the nearest kilobyte, but this resulted in
excessive precision for large analyses. This changes
the estimates to always be reported in whole megabytes,
rounded up from the low level estimate.

Closes #1110
Closes elastic/elasticsearch#54506