Currently, our upfront memory usage estimate is an upper bound and a significant overestimate in most cases. This issue covers a few quick wins for improving the estimate:
- Pass the training percentage: we need to know the number of rows actually used to train.
- Pass the number of distinct values for each feature: this would let us better estimate how much memory we'll use for aggregate loss derivatives (as sketched below).
- Account for the maximum number of features we will select.
- SHAP memory usage is not concurrent with the leaf statistics memory usage, so we should not simply sum the two.
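To make the shape of such a refinement concrete, here is a minimal C++ sketch that folds the training fraction, per-feature distinct value counts and the feature selection cap into a single estimate. All type names, constants and byte costs are hypothetical placeholders for illustration, not the actual estimation code.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

struct EstimateInputs {
    std::size_t numberRows = 0;              // total rows in the data frame
    double trainingFraction = 1.0;           // fraction of rows used for training
    std::vector<std::size_t> distinctValues; // distinct values per feature
    std::size_t maxSelectedFeatures = 0;     // cap on the number of features we will select
};

std::size_t estimateTrainingMemory(const EstimateInputs& in) {
    // Count only the rows actually used to train, not the whole data frame.
    auto trainingRows = static_cast<std::size_t>(
        in.trainingFraction * static_cast<double>(in.numberRows));

    // Aggregate loss derivatives are accumulated per candidate split value, so
    // the cost scales with the distinct values of the features we can select.
    std::vector<std::size_t> values{in.distinctValues};
    std::sort(values.begin(), values.end(), std::greater<>());
    std::size_t selected = std::min(in.maxSelectedFeatures, values.size());
    std::size_t splitCandidates = std::accumulate(
        values.begin(), values.begin() + static_cast<std::ptrdiff_t>(selected),
        std::size_t{0});

    // Illustrative byte costs: gradient, curvature and count per derivative
    // bucket, plus a fixed per-row overhead for the packed training rows.
    constexpr std::size_t BYTES_PER_DERIVATIVE_BUCKET = 3 * sizeof(double);
    constexpr std::size_t BYTES_PER_ROW = 16;

    return trainingRows * BYTES_PER_ROW +
           splitCandidates * BYTES_PER_DERIVATIVE_BUCKET;
}
```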
A better strategy (longer term) would be, rather than estimating a memory upper bound, to estimate a value which training is very unlikely to exceed. This would require support for circuit breaking during training. Since we snapshot state periodically, the user would still be able to retrospectively increase the memory limit and restart the analysis.
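As an illustration of that longer-term direction, the following is a hedged C++ sketch of a training-time circuit breaker combined with periodic snapshots. The class, callbacks and training loop are invented for this example and do not reflect the real training code.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>

class TrainingCircuitBreaker {
public:
    TrainingCircuitBreaker(std::size_t softLimitBytes,
                           std::function<std::size_t()> currentUsageBytes)
        : m_SoftLimit{softLimitBytes}, m_CurrentUsage{std::move(currentUsageBytes)} {}

    // Throwing aborts the current round; a previously written snapshot lets the
    // user raise the memory limit and resume rather than losing all progress.
    void check() const {
        if (m_CurrentUsage() > m_SoftLimit) {
            throw std::runtime_error{
                "memory limit exceeded: increase the limit and restart from the last snapshot"};
        }
    }

private:
    std::size_t m_SoftLimit;
    std::function<std::size_t()> m_CurrentUsage;
};

// Sketch of how the breaker might slot into a boosting loop.
template <typename Forest, typename TrainOneTree, typename Snapshot>
void trainWithCircuitBreaker(Forest& forest, std::size_t rounds,
                             TrainOneTree trainOneTree, Snapshot snapshot,
                             const TrainingCircuitBreaker& breaker) {
    for (std::size_t round = 0; round < rounds; ++round) {
        breaker.check();       // trip before committing to another tree
        trainOneTree(forest);  // grow one more tree
        if (round % 10 == 9) { // periodic snapshot of training state
            snapshot(forest);
        }
    }
}
```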
We've done quite a lot of work on memory usage since this issue was created. Whilst it would be possible to refine estimates if we knew certain features had relatively few distinct values, this is not a priority at present. We can revisit if we decide we need better memory estimates in the future.