Model Size is big in auto sklearn #1359

shabir1 · 2021-12-27T08:16:12Z

Auto-sklearn models are much larger than the corresponding scikit-learn models. Examples are below:

1. With ensemble_size=30
AutoSklearnRegressor(
    ensemble_nbest=32,
    ensemble_size=30,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=32,
    per_run_time_limit=100,
    time_left_for_this_task=350)

Model size: 789MB

2. With ensemble_size=10
AutoSklearnRegressor(
    ensemble_nbest=12,
    ensemble_size=10,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=12,
    per_run_time_limit=100,
    time_left_for_this_task=350)

Model size: 786MB

3. With ensemble_size=1
AutoSklearnRegressor(
    ensemble_nbest=3,
    ensemble_size=1,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=3,
    per_run_time_limit=100,
    time_left_for_this_task=350)
Selected model: Random Forest
Model size: 777MB

4. Plain scikit-learn model
ExtraTreesRegressor(n_estimators=30, random_state=0)
Model size: 58MB

5. Plain scikit-learn model. This is the same model that AutoSklearnRegressor selected with ensemble_size=1, run with the same parameters, yet the sizes differ hugely: 122MB versus 777MB.
RandomForestRegressor(bootstrap=True, criterion='mse')
Model size: 122MB

I ran auto-sklearn without feature or data preprocessing, but the model size is still very large.
If the size were due to the ensemble, it should change with ensemble_size, yet with values of 30, 10, and 1 the model size is almost the same. Why?
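
The post does not say how the sizes above were measured. A minimal sketch of one common way to do it, assuming "model size" means the size of the pickled object; the helper name `pickled_size_mb` is hypothetical and not part of auto-sklearn or scikit-learn:

```python
import pickle


def pickled_size_mb(obj) -> float:
    """Return the size of obj's pickle in megabytes (1 MB = 10**6 bytes).

    Assumption: the sizes reported in this issue are pickle sizes.
    A joblib dump or an on-disk file size may give different numbers.
    """
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return len(payload) / 1e6


# Works on any picklable object, including a fitted estimator.
sample = list(range(100_000))
print(f"sample object: {pickled_size_mb(sample):.2f} MB")
```

Calling the same helper on the fitted AutoSklearnRegressor and on the plain scikit-learn estimator would put both numbers on the same footing.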

eddiebergman · 2022-01-10T09:30:32Z

This is likely due to all the imports, saved predictions, and everything else we use for optimization being bundled into the object. I would not use the full auto-sklearn model in production at the moment; instead, try to export or retrain the found models.

shabir1 · 2022-01-10T09:58:31Z

@eddiebergman You mean I should take the best configurations and create an sklearn ensemble model out of them?

eddiebergman · 2022-01-10T10:01:47Z

There are ways to do that by extracting the models; this is made easier by a recent PR from @userfindingself, #1321. It is available in the development branch; otherwise you can view the code there and adapt it to your needs. Down the road, we would like to add an export option for more production-ready models.
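
A rough sketch of what extracting the models could look like. It assumes the fitted estimator exposes `get_models_with_weights()` returning `(weight, pipeline)` pairs, which should be checked against the installed auto-sklearn version. The function names and the stand-in class are hypothetical and exist only so the sketch runs without auto-sklearn installed.

```python
import os
import pickle
import tempfile


def export_ensemble_members(automl, path):
    """Pickle only the (weight, pipeline) pairs of a fitted estimator.

    Assumption: automl.get_models_with_weights() exists and returns an
    iterable of (weight, pipeline) tuples.
    """
    members = list(automl.get_models_with_weights())
    with open(path, "wb") as fh:
        pickle.dump(members, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return len(members)


def load_ensemble_members(path):
    """Load the (weight, pipeline) pairs written by export_ensemble_members."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


class _StandIn:
    """Hypothetical stand-in for a fitted AutoSklearnRegressor."""

    def get_models_with_weights(self):
        return [(0.7, "pipeline_a"), (0.3, "pipeline_b")]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "members.pkl")
    count = export_ensemble_members(_StandIn(), target)
    restored = load_ensemble_members(target)
```

With real pipelines, the restored objects would still need auto-sklearn importable to unpickle, and predictions would have to be combined by weight manually. How much smaller the file gets is not something this thread establishes.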

shabir1 · 2022-01-10T10:15:12Z

@eddiebergman In the next version of auto-sklearn, can we get a final model with a small size for prediction?

eddiebergman · 2022-01-10T10:45:30Z

We are reworking some internals, so this will likely not be a feature in the next release. Apologies.

shabir1 · 2022-01-10T11:30:34Z

@eddiebergman Thank you, sir, for your quick response.

mfeurer · 2022-01-10T11:50:30Z

One additional note: Auto-sklearn uses 512 trees, while scikit-learn uses only 100 trees by default. This explains why the models are in the ballpark of > 500MB.
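
The tree-count explanation can be checked with a back-of-the-envelope calculation. It assumes that forest size grows roughly linearly with the number of trees and that the 122MB RandomForestRegressor from the original post used the default of 100 trees. Both are assumptions, not measurements from this thread.

```python
# Figures reported in this issue.
SKLEARN_FOREST_MB = 122        # RandomForestRegressor, default settings
AUTOSKLEARN_REPORTED_MB = 777  # AutoSklearnRegressor with ensemble_size=1

# Tree counts from the comment above.
SKLEARN_TREES = 100
AUTOSKLEARN_TREES = 512

# Assumption: size scales linearly with the number of trees.
mb_per_tree = SKLEARN_FOREST_MB / SKLEARN_TREES
estimated_mb = mb_per_tree * AUTOSKLEARN_TREES
unexplained_mb = AUTOSKLEARN_REPORTED_MB - estimated_mb

print(f"estimated size of a 512-tree forest: {estimated_mb:.1f} MB")
print(f"not explained by tree count alone: {unexplained_mb:.1f} MB")
```

Under these assumptions the 512 trees account for roughly 625MB of the reported 777MB. That leaves about 150MB that would have to come from something else, such as the bundled optimization data mentioned earlier in the thread.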

eddiebergman added the question label Jan 10, 2022

shabir1 closed this as completed Jan 10, 2022
