Add ability to use built-in pickle for saving AutoMLSearch #2463
Changes from all commits
c6eef38
dd4d6d7
299b140
962382d
c1352e7
72da20e
004b69d
4b0e2cd
e0cc73a
833e034
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
import copy | ||
import pickle | ||
import sys | ||
import time | ||
import traceback | ||
|
@@ -1307,31 +1308,50 @@ def best_pipeline(self): | |
|
||
return self._best_pipeline | ||
|
||
def save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL): | ||
def save( | ||
self, | ||
file_path, | ||
pickle_type="cloudpickle", | ||
pickle_protocol=cloudpickle.DEFAULT_PROTOCOL, | ||
): | ||
"""Saves AutoML object at file path | ||
|
||
Arguments: | ||
file_path (str): location to save file | ||
pickle_type {"pickle", "cloudpickle"}: the pickling library to use. | ||
pickle_protocol (int): the pickle data stream format. | ||
|
||
Returns: | ||
None | ||
""" | ||
if pickle_type == "cloudpickle": | ||
pkl_lib = cloudpickle | ||
elif pickle_type == "pickle": | ||
pkl_lib = pickle | ||
else: | ||
raise ValueError( | ||
f"`pickle_type` must be either 'pickle' or 'cloudpickle'. Received {pickle_type}" | ||
) | ||
|
||
with open(file_path, "wb") as f: | ||
cloudpickle.dump(self, f, protocol=pickle_protocol) | ||
pkl_lib.dump(self, f, protocol=pickle_protocol) | ||
|
||
@staticmethod | ||
def load(file_path): | ||
def load( | ||
file_path, | ||
pickle_type="cloudpickle", | ||
): | ||
"""Loads AutoML object at file path | ||
|
||
Arguments: | ||
file_path (str): location to find file to load | ||
pickle_type {"pickle", "cloudpickle"}: the pickling library to use. Currently not used since the standard pickle library can handle cloudpickles. | ||
|
||
Returns: | ||
AutoSearchBase object | ||
""" | ||
with open(file_path, "rb") as f: | ||
return cloudpickle.load(f) | ||
return pickle.load(f) | ||
So we can save with cloudpickle and read with pickle?! Wow, I had no idea that works hehe. I feel like we should accept an argument here for the "pickle type"? Feels weird to offer a choice of library for
Hmm yeah I agree, it does feel symmetrical to do so. I think it would be a no-op though, since it looks like the doc for cloudpickle just recommends using the standard Python pickler for loading. |
||
|
||
def train_pipelines(self, pipelines): | ||
"""Train a list of pipelines on the training data. | ||
|
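For readers who want to try the pattern outside of AutoMLSearch, here is a minimal sketch of the dispatch logic shown in the diff. The names `save_object` and `load_object` are hypothetical helpers, not part of the library, and the sketch imports cloudpickle lazily (an assumption made only so the stdlib path runs without the third-party package). It also defaults to `pickle.HIGHEST_PROTOCOL` rather than `cloudpickle.DEFAULT_PROTOCOL` for the same reason.

```python
import os
import pickle
import tempfile


def save_object(obj, file_path, pickle_type="cloudpickle",
                pickle_protocol=pickle.HIGHEST_PROTOCOL):
    """Hypothetical helper mirroring the library selection in the diff."""
    if pickle_type == "cloudpickle":
        # Third-party package; imported lazily in this sketch only.
        import cloudpickle
        pkl_lib = cloudpickle
    elif pickle_type == "pickle":
        pkl_lib = pickle
    else:
        raise ValueError(
            f"`pickle_type` must be either 'pickle' or 'cloudpickle'. Received {pickle_type}"
        )
    with open(file_path, "wb") as f:
        pkl_lib.dump(obj, f, protocol=pickle_protocol)


def load_object(file_path):
    """Hypothetical helper: load with the standard pickle module, as in the diff."""
    with open(file_path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "automl.pkl")
    save_object({"best_score": 0.91}, path, pickle_type="pickle")
    restored = load_object(path)

print(restored)
```

An unsupported `pickle_type` fails before any file is opened, so a typo in the argument cannot leave a partial file behind.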
What if the user uses a cloudpickle protocol while trying to use pickle? 🤔
Apparently cloudpickles can be opened by the regular pickling library (according to the example in their README.md and the docstring for cloudpickle.py in the attached screenshot)! I didn't know this before, so that's pretty neat.
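The behaviour described in this reply can be checked with a short round trip. This is only a sketch: `roundtrip_with_stdlib_load` is a hypothetical name, and the block falls back to the standard pickler when cloudpickle is not installed so that it still runs, in which case it no longer exercises the cross-library path.

```python
import pickle

try:
    import cloudpickle  # third-party; assumed installed for the real check
except ImportError:
    cloudpickle = None


def roundtrip_with_stdlib_load(obj):
    """Dump with cloudpickle when available, then load with stdlib pickle."""
    dumper = cloudpickle if cloudpickle is not None else pickle
    payload = dumper.dumps(obj)
    # Loading always goes through the standard library, as in the diff.
    return pickle.loads(payload)


settings = {"objective": "log loss", "max_batches": 3}
restored = roundtrip_with_stdlib_load(settings)
print(restored == settings)
```

One caveat worth keeping in mind: objects that cloudpickle serializes by value, such as lambdas, may still need cloudpickle to be importable in the process that loads them, even though the call itself is `pickle.load`.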
So cool 🤩
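On the protocol question raised earlier in this thread, a quick check is sketched below. It assumes that `cloudpickle.DEFAULT_PROTOCOL` is an ordinary pickle protocol number no higher than `pickle.HIGHEST_PROTOCOL`. When cloudpickle is not installed, the sketch substitutes `pickle.HIGHEST_PROTOCOL`, which is an assumption about cloudpickle's default rather than something this thread confirms.

```python
import pickle

try:
    import cloudpickle  # third-party; optional for this sketch
    save_protocol = cloudpickle.DEFAULT_PROTOCOL
except ImportError:
    # Assumption: stand in for cloudpickle's default with the highest stdlib protocol.
    save_protocol = pickle.HIGHEST_PROTOCOL

data = {"pipeline": "baseline", "score": 0.5}

# The standard pickler accepts any protocol up to pickle.HIGHEST_PROTOCOL.
payload = pickle.dumps(data, protocol=save_protocol)
restored = pickle.loads(payload)

# A protocol above that limit is rejected with ValueError.
try:
    pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL + 1)
    rejected = False
except ValueError:
    rejected = True

print(restored == data, rejected)
```

If the assumption holds, passing the default `pickle_protocol` together with `pickle_type="pickle"` is safe; only a protocol number beyond what the running interpreter supports would fail.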