Do not use deepcopy to copy nn.modules (including entire models) #2177
cc: @mthrok

Thanks for creating this issue @mikekgfb. Will submit a PR with the fixes soon.
@svekars I don't think this issue should be closed. The PR I sent out fixed the issue for https://pytorch.org/tutorials/beginner/transformer_tutorial.html, but deepcopy is still used in several other places, e.g. https://github.com/search?q=repo%3Apytorch%2Ftutorials%20copy.deepcopy(&type=code
Can the owners of the tutorials please take a look and submit corrections? If any of these tutorials describe an obsolete feature, please submit a PR to remove them.

For quantization, we were assuming that
cc @qihqi |
Q: Are you sure deepcopy is not supposed to be defined for nn.Module? https://github.com/pytorch/pytorch/blob/master/torch/optim/swa_utils.py appears to be using deepcopy.
Thanks for the information. |
copy.deepcopy() is not defined for nn.Module and does not reliably copy an nn.Module hierarchy, such as a model or partial model. Our tutorials should not use copy.deepcopy(), as this will induce our users to make incorrect use of the primitive. Please update https://pytorch.org/tutorials/beginner/transformer_tutorial.html and other tutorials to avoid it. The recommended way to snapshot a trained model is torch.save(). To create multiple clones of an untrained model, construct each copy from first principles from the model's architectural parameters.
cc @pytorch/team-text-core @Nayef211
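The recommendation above can be sketched roughly as follows. The `make_model` helper and its dimensions are hypothetical stand-ins for a real model's architectural parameters, and an in-memory buffer stands in for a checkpoint file:

```python
# Sketch of the alternatives to copy.deepcopy(model) suggested above:
# torch.save() for snapshotting, and fresh construction for untrained clones.
import io

import torch
import torch.nn as nn


def make_model(in_features: int = 4, out_features: int = 2) -> nn.Module:
    # Rebuild the model "from first principles" from its architectural
    # parameters instead of deep-copying an existing instance.
    # (nn.Linear is a hypothetical placeholder for a real model.)
    return nn.Linear(in_features, out_features)


model = make_model()

# Snapshotting a trained model: save its state dict with torch.save().
# A BytesIO buffer is used here instead of a file path for illustration.
buf = io.BytesIO()
torch.save(model.state_dict(), buf)

# Restore the snapshot into a freshly constructed model.
buf.seek(0)
restored = make_model()
restored.load_state_dict(torch.load(buf))

# Multiple untrained clones: construct each one from the architecture,
# rather than deep-copying a single instance.
clones = [make_model() for _ in range(3)]
```

Because `restored` is built by the same constructor and then loaded from the saved state dict, it carries the same parameters as `model` without ever deep-copying the module object itself.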