ddp_cpu breaks while looking for .module: ModuleAttributeError: 'BoringModel' object has no attribute 'module' #4356
Comments
Thanks for the issue! Want to submit a fix?
I took a shot at it, but realized this is part of a larger issue. There is not much documentation about the supported uses for
So the error reported in this issue refers to 1., but 2. also seems to fail. Here is one test to check (fails on master):

    import pytest
    from pytorch_lightning import Trainer
    # Assumed import path: the model template lives in the Lightning test suite.
    from tests.base import EvalModelTemplate

    @pytest.mark.parametrize("accelerator", [None, "ddp_spawn"])
    def test_trainer_num_processes_without_ddp_cpu(tmpdir, accelerator):
        # Request two processes without selecting accelerator="ddp_cpu".
        trainer = Trainer(
            default_root_dir=tmpdir,
            weights_summary=None,
            logger=False,
            checkpoint_callback=False,
            progress_bar_refresh_rate=0,
            fast_dev_run=True,
            num_processes=2,
            accelerator=accelerator,
        )
        trainer.fit(EvalModelTemplate())

If what I'm saying is correct, it also means that this warning is wrong:
Hopefully someone can clear up what the expected behaviour is and provide sensible warnings/errors as appropriate. Also improve the docs about what are the uses of
This will probably get cleaned up by the proposal here: #6090
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!
Closing in favor of #6090, which will clarify the accelerator arguments
🐛 Bug
https://colab.research.google.com/drive/1hMW-0sTTgK-r6xfdwuSDyRNJBH7YYm9V?usp=sharing
To Reproduce
Set `num_processes=2` in the trainer (without `accelerator="ddp_cpu"`). I know this is an invalid combination, but a user of my library got confused by the error.
Expected behavior
The property `num_processes` is ignored, as mentioned in the warning:
UserWarning: num_processes is only used for distributed_backend="ddp_cpu". Ignoring it.
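To make the expected behavior concrete, here is a minimal sketch of what "ignoring" `num_processes` could look like. The helper `resolve_num_processes` is hypothetical and is not Lightning's actual code; it only assumes that the warning text quoted in this issue is emitted and that training then falls back to a single process.

```python
import warnings


def resolve_num_processes(num_processes, accelerator=None):
    """Hypothetical helper (not part of Lightning).

    Returns the number of processes that should actually be launched.
    """
    if accelerator != "ddp_cpu" and num_processes != 1:
        # Emit the warning quoted in this issue, then fall back to a
        # single process so nothing downstream expects a DDP wrapper.
        warnings.warn(
            'num_processes is only used for distributed_backend="ddp_cpu". '
            "Ignoring it.",
            UserWarning,
        )
        return 1
    return num_processes
```

With this sketch, `resolve_num_processes(2)` would warn and return `1`, while `resolve_num_processes(2, "ddp_cpu")` would return `2` without a warning.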
Environment