cloudpickle is not stable in notebooks #538
Comments
Gentle ping! Any recommendations are much appreciated. |
Just to share more observed behavior, the output of the above code contains string like |
I gave it a try using the
Here is an updated version of the snippet I used in my notebook cell to get a richer output:

```
import cloudpickle
from pickletools import dis
from hashlib import sha256

MY_PI = 3.1415

def get_pi():
    return MY_PI

dumped_get_pi = cloudpickle.dumps(get_pi)
print(sha256(dumped_get_pi).hexdigest())
print(dis(dumped_get_pi))
```

I used ipykernel version 6.29.5 and jupyterlab version 4.2.4. Anyways, I am not sure we want to make cloudpickle too magic w.r.t. the handling of jupyter's implementation details. |
Thanks @ogrisel for the reply. I updated the description to clarify that I am using a Colab notebook, and I can still reproduce the behavior today (cloudpickle version 3.1.1). |
Actually, I can reproduce on colab with:

```
import cloudpickle
import pickletools
import hashlib

MY_PI = 3.1415

def get_pi():
    return MY_PI

pickled = cloudpickle.dumps(get_pi)
print(hashlib.sha224(pickled).hexdigest())
print(pickletools.dis(pickled))
```

and indeed I found at least one random offender:
|
I can also reproduce locally in a jupyter notebook, but it only changes
when I restart the kernel, with a similar changing part:
```
136: \x8c SHORT_BINUNICODE '/tmp/ipykernel_38859/3419112748.py'
```
It seems to come from `obj.co_filename`, but I am actually not sure why we should preserve this.
This is a dynamic code object so changing the file name should not change the way it is pickled.
I would be in favor of overwriting this argument with something like `<dynamic>`, as the filename is probably never informative.
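As a rough illustration of the `<dynamic>` idea, here is a minimal user-side sketch that rewrites `co_filename` on a copy of the function before it would be pickled. It is not cloudpickle's implementation. The helper names `define_get_pi` and `with_fixed_filename` and the two `/tmp/ipykernel_...` paths are made up for the example, and `compile()` with different file names only simulates what two notebook cell runs might produce.

```python
import types

SOURCE = "MY_PI = 3.1415\ndef get_pi():\n    return MY_PI\n"


def define_get_pi(filename):
    """Compile SOURCE as if it lived in `filename` and return get_pi (hypothetical helper)."""
    namespace = {}
    exec(compile(SOURCE, filename, "exec"), namespace)
    return namespace["get_pi"]


def with_fixed_filename(func, filename="<dynamic>"):
    """Return a copy of `func` whose code object reports a fixed co_filename.

    Uses code.replace(), available since Python 3.8. Code objects nested in
    co_consts (inner functions, comprehensions) are left untouched here.
    """
    new_code = func.__code__.replace(co_filename=filename)
    clone = types.FunctionType(
        new_code,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )
    clone.__kwdefaults__ = func.__kwdefaults__
    clone.__qualname__ = func.__qualname__
    clone.__doc__ = func.__doc__
    return clone


# Two "runs" of the same cell, each with a different (hypothetical) file name.
run_a = define_get_pi("/tmp/ipykernel_1/111.py")
run_b = define_get_pi("/tmp/ipykernel_2/222.py")

norm_a = with_fixed_filename(run_a)
norm_b = with_fixed_filename(run_b)
print(run_a.__code__.co_filename, "->", norm_a.__code__.co_filename)
print(run_b.__code__.co_filename, "->", norm_b.__code__.co_filename)
```

Passing the normalized copy to `cloudpickle.dumps` should at least keep the ipykernel path out of the payload. Whether the resulting bytes are then fully stable across kernel restarts is something I have not verified.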
|
Consider this code in a colab notebook cell:
Every time I rerun this cell I get a different output. This is unlike a Python script, which gives a consistent output. I am trying to use cloudpickle to capture the function and persist it in storage for later use. I want to update the storage only when there is a material change in the behavior of the function, but because of this notebook behavior I am running into redundant storage updates, which are costly. Is there a way I can avoid this?
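One possible way to decide whether a storage update is needed, sketched under assumptions: compute a change-detection key from the code object itself rather than from the pickled bytes, so that `co_filename` never enters the hash. The names `code_fingerprint` and `compile_function` and the file paths below are hypothetical, and `compile()` with different file names stands in for rerunning a notebook cell.

```python
import hashlib


def code_fingerprint(code):
    """Hash bytecode, names and constants of a code object.

    co_filename and line numbers are deliberately left out.
    """
    digest = hashlib.sha256()
    digest.update(code.co_code)
    digest.update(repr((code.co_names, code.co_varnames, code.co_argcount)).encode())
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            # Nested code object: recurse instead of using repr(), which
            # would embed the file name and a memory address.
            digest.update(code_fingerprint(const).encode())
        else:
            digest.update(repr(const).encode())
    return digest.hexdigest()


def compile_function(source, filename, name):
    """Compile `source` as if it lived in `filename` and return `name` (hypothetical helper)."""
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    return namespace[name]


SOURCE = "def get_pi():\n    return 3.1415\n"
first_run = compile_function(SOURCE, "/tmp/ipykernel_1/111.py", "get_pi")
second_run = compile_function(SOURCE, "/tmp/ipykernel_2/222.py", "get_pi")
edited = compile_function(SOURCE.replace("3.1415", "3.14"), "/tmp/ipykernel_2/333.py", "get_pi")

# Same source, different file names: fingerprints match.
same = code_fingerprint(first_run.__code__) == code_fingerprint(second_run.__code__)
# Edited source: fingerprint changes.
changed = code_fingerprint(first_run.__code__) != code_fingerprint(edited.__code__)
print(same, changed)
```

This only looks at the function's own code. It ignores the values of referenced globals (such as `MY_PI`), default arguments, and closures, and bytecode differs between Python versions, so treat it as a starting point rather than a complete notion of "material change".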