Steps to reproduce:

1. Import the Iowa liquor sales dataset using the file data visualizer (ping me if you need the file).
2. Create a new regression data frame analytics job to analyze it, setting the training percent to 5, the dependent variable to Sale (Dollars), and excluding every variable except Bottle Volume (ml) and Store Number from the analysis. (So effectively we're predicting one number from two others on 5% of the 380,000 rows, i.e. 19,000 rows.)
3. Start the analysis and wait for it to finish.
4. Look at the job details.

The memory limit recommended by the UI was around 1.2GB. The actual memory required was less than 12MB.
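For reference, a minimal dev-console sketch of the job created in step 2. The index name, job id and exact field names here are assumptions based on the steps above, and `analyzed_fields.includes` is used as shorthand for excluding every other field:

```
# Hypothetical names: index "iowa-liquor-sales", job id "iowa-sales-regression"
PUT _ml/data_frame/analytics/iowa-sales-regression
{
  "source": { "index": "iowa-liquor-sales" },
  "dest": { "index": "iowa-sales-regression-results" },
  "analysis": {
    "regression": {
      "dependent_variable": "Sale (Dollars)",
      "training_percent": 5
    }
  },
  "analyzed_fields": {
    "includes": ["Sale (Dollars)", "Bottle Volume (ml)", "Store Number"]
  }
}

# Start it and wait for it to finish (step 3)
POST _ml/data_frame/analytics/iowa-sales-regression/_start
```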
Part of the problem here is elastic/kibana#60496, because the memory estimate didn't get updated when I added the exclude fields. However, a considerable part of the problem is in the C++ estimation code. If I run the estimate in the dev console using the final config it's still 25 times bigger than it needs to be.
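A rough sketch of that dev-console estimate call, using the same assumed index and field names as in the job sketch above; the relevant figure comes back in `memory_estimation.expected_memory_without_disk`:

```
# Ask for the memory estimate without creating or starting a job
POST _ml/data_frame/analytics/_explain
{
  "source": { "index": "iowa-liquor-sales" },
  "analysis": {
    "regression": {
      "dependent_variable": "Sale (Dollars)",
      "training_percent": 5
    }
  },
  "analyzed_fields": {
    "includes": ["Sale (Dollars)", "Bottle Volume (ml)", "Store Number"]
  }
}
```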
And from the second screenshot you can see the actual memory used was 12322863 bytes ≈ 12034 KB.
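The same peak figure is also available outside the UI from the job stats (job id assumed as above); the `memory_usage` section of the response reports the peak bytes, though field names may differ slightly between versions:

```
# Peak memory actually used by the analysis appears in the stats response
GET _ml/data_frame/analytics/iowa-sales-regression/_stats
```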
This is a big problem for Cloud trials where users don't have much memory to play with, and we refuse to run an analysis if its memory estimate won't fit onto the available machine.
This is partly a known issue: we need to communicate the training percentage to the memory estimation process, since this very significantly affects the actual memory usage.
We've discussed this and we're going to work on calibrating the current worst-case memory estimates based on a variety of different classification and regression runs.