Skip to content

[ML] Feature/forecast scale #89

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 14, 2018

Conversation

hendrikmuhs
Copy link

@hendrikmuhs hendrikmuhs commented May 4, 2018

This implements forecast model cloning overflowing to disk to allow forecasting of large jobs. The tmp storage location has to be given as parameter (X-Pack change: elastic/elasticsearch#30399). If the parameter is not given, this feature is disabled. For models smaller than the existing limit (20MB) nothing changes.

Todo:

  • changelog
  • x-pack PR

To be merged together once corresponding X-Pack PR and this PR are ready.

Release note: Forecasting of large machine learning jobs is now supported by temporarily storing
model state on disk

Hendrik Muhs added 3 commits April 17, 2018 18:28
This implements the C++ side of forecast persistence. An additional parameter allows the forecast runner to persist models on disk for temporary purposes. Models are loaded back into memory one by one.

For models smaller than the current limit of 20MB nothing changes.
Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM subject to a few nits


// Note: This value measures the size in memory, not the size of the persistence,
// which is likely higher and would be hard to calculate upfront
//! max memory allowed to use for forecast models persisting to disk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the last line of this begins with //!. All the lines should be consistent.

// if you change this value also change the limit on X-pack side.
// The purpose of this value is to guard the rest of the system regarding
// an out of disk space
//! minimum disk space required for disk persistence
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the last line of this begins with //!. All the lines should be consistent.

@@ -351,6 +351,9 @@ class MODEL_EXPORT CModelFactory {
//! component.
std::size_t componentSize() const;

// Get the minimum seasonal variance scale, specific to the model
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a Doxygen comment

series.s_ToForecast.emplace_back(
feature, std::move(model), byFieldValue);
} else // restorer exhausted, no need for further restoring
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could move { after else but before comment

@hendrikmuhs hendrikmuhs merged commit dfd88ed into elastic:master May 14, 2018
hendrikmuhs pushed a commit that referenced this pull request May 14, 2018
This implements the C++ side of forecast persistence. An additional parameter allows the forecast runner to persist models on disk for temporary purposes. Models are loaded back into memory one by one.
@droberts195 droberts195 added review and removed WIP labels May 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants