
Add enable_device_summary flag to disable device printout #13378


Open
CompRhys opened this issue Jun 22, 2022 · 14 comments
Labels: callback, feature (Is an improvement or enhancement), good first issue (Good for newcomers), trainer: argument
@CompRhys
Contributor

CompRhys commented Jun 22, 2022

🚀 Feature

Add an enable_device_summary boolean kwarg to pl.Trainer() to suppress _log_device_info()'s output.
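
A minimal sketch of the proposed usage; enable_device_summary is the flag requested here and does not exist in the Trainer API today:

import pytorch_lightning as pl

# Hypothetical flag, mirroring the existing enable_model_summary / enable_progress_bar
# arguments; passing it to a current Lightning release would raise a TypeError.
trainer = pl.Trainer(enable_device_summary=False)  # no "GPU available: ..." lines on stdout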

Motivation

When calling predict within a surrogate-model loop, the Trainer prints out the devices each time, breaking apart intended tables and other output. Related to #13358 about cleaning up/reducing stdout verbosity.

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
=======================================================
n_gen |  n_eval |  n_nds  |     eps      |  indicator  
=======================================================
    1 |     322 |       3 |            - |            -
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
    2 |    1322 |       4 |  0.625000000 |        ideal

Pitch

Add an enable_device_summary kwarg to Trainer that defaults to True.

Alternatives

The suggested solution is the simplest; any alternative would add more complexity.

Additional context

None



cc @Borda @awaelchli @ananthsub @rohitgr7 @justusschock @kaushikb11

CompRhys added the "needs triage" (Waiting to be triaged by maintainers) label on Jun 22, 2022
@ananthsub
Contributor

I don't think another flag should be added. Why not move the print out to the Trainer constructor so it's only printed once?

@CompRhys
Contributor Author

CompRhys commented Jun 23, 2022

I don't think another flag should be added. Why not move the print out to the Trainer constructor so it's only printed once?

It already is. In the linked issue (#13358), in order to reduce uncontrollable verbosity, I was advised to create a secondary Trainer. There's no need to persist this trainer in the secondary optimisation loop, so it gets deleted by the garbage collector and reinitialised when needed.
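
A rough sketch of the pattern described above; the loop bound, surrogate_model, and candidate_loader are illustrative placeholders rather than code from the linked issue:

import pytorch_lightning as pl

for generation in range(10):  # outer surrogate-optimisation loop
    # a throwaway Trainer is built for each round of candidate predictions ...
    trainer = pl.Trainer(enable_progress_bar=False, enable_model_summary=False)
    predictions = trainer.predict(surrogate_model, dataloaders=candidate_loader)
    # ... and is garbage-collected at the end of the iteration, so the device
    # summary is printed again on the next pass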

In general, afaik, uncontrollable verbosity isn't ideal, and in terms of unhelpful verbosity the number of TPUs, HPUs, and IPUs is less likely to be informative than the model summary, which already has an option to suppress it.

carmocca added the "feature" (Is an improvement or enhancement), "discussion" (In a discussion stage), and "trainer: argument" labels and removed the "needs triage" label on Jun 23, 2022
@awaelchli
Contributor

I'm curious, is there a desire to have verbosity controlled on a more global level, not just the summary here?

@CompRhys
Contributor Author

I'm curious, is there a desire to have verbosity controlled on a more global level, not just the summary here?

I think that between the enable_* options for the model summary and progress bar (creating a new trainer when needed to adjust them) and the possibility of making things PossibleUserWarnings (which can then be filtered), you can control pretty much everything apart from this device summary.

A verbose=int setup, cf. sklearn, could work but would be a much bigger change.
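
For reference, a sketch of the existing controls mentioned above; the PossibleUserWarning import path may differ between Lightning versions:

import warnings

import pytorch_lightning as pl
from pytorch_lightning.utilities.warnings import PossibleUserWarning

# silence Lightning's "possible user warning" hints
warnings.filterwarnings("ignore", category=PossibleUserWarning)

# the model summary and progress bar already have dedicated Trainer flags
trainer = pl.Trainer(enable_model_summary=False, enable_progress_bar=False)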

@CompRhys
Contributor Author

Happy for me to mark the associated PR as ready to be reviewed?

@awaelchli
Contributor

@CompRhys thanks for the PR. Since this is adding a flag to the core API in Trainer, we need to discuss it with the core team @Lightning-AI/core-lightning and get some more opinions.

I think there are also a few options we haven't explored yet.

  1. One could move the device logs to a configurable callback, like the model summary or trainer summary (a rough sketch follows below).
  2. Let the messages be more easily filtered through logging.
  3. Introduce a verbose flag to control messaging through Trainer more generally (e.g. fast_dev_run infos).
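
A rough sketch of option 1, assuming a hypothetical DeviceSummary callback; the setup hook and rank_zero_info helper are part of the public API, but the callback itself does not exist in Lightning:

import torch
import pytorch_lightning as pl
from pytorch_lightning.utilities.rank_zero import rank_zero_info


class DeviceSummary(pl.Callback):
    """Hypothetical callback printing the device summary once at setup.

    A user who wants a quiet Trainer would simply not add this callback,
    or the Trainer could skip adding it when a disable flag is set.
    """

    def setup(self, trainer, pl_module, stage=None):
        # only the GPU line is sketched; the TPU/IPU/HPU lines would follow the same pattern
        rank_zero_info(f"GPU available: {torch.cuda.is_available()}")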

@rohitgr7
Contributor

  2. Let the messages be more easily filtered through logging
  3. Introduce a verbose flag to control messaging through Trainer more generally (e.g. fast_dev_run infos)

These two seem like the better solutions. I'd prefer 3 if there are more logs we could configure.

@justusschock
Member

I'd prefer a combination of 1. and 2.

IMO it really is not necessary to have that baked into the core Trainer (just as the model summary was not necessary).

And having it more easily filtered would also be great (I tried redirecting the streams elsewhere to avoid the prints, and that didn't work either).

@carmocca
Contributor

Today, it can be silenced by doing this:

import logging

# drop any log record containing one of the device lines; the substring matches
# "GPU available: ", "TPU available: ", "IPU available: " and "HPU available: "
def device_info_filter(record):
    return "PU available: " not in record.getMessage()

# the device summary is emitted through Lightning's rank-zero logger
logging.getLogger("pytorch_lightning.utilities.rank_zero").addFilter(device_info_filter)
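
Note that the filter has to be registered before the Trainer is instantiated, since (as noted above) the device summary is emitted from the Trainer constructor.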

I find the callback idea (1) a bit overkill.
With (2) we can improve the above, maybe by using a Trainer logger instead of the rank zero logger.
(3) seems like it has a larger scope. It would be interesting to see what your concrete ideas are. But for the device info message in particular, we've always agreed that it should be shown, and not just when fast_dev_run=True.

@stale

stale bot commented Jul 31, 2022

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, PyTorch Lightning Team!

The stale bot added the "won't fix" (This will not be worked on) label on Jul 31, 2022
@carmocca
Contributor

I changed my mind. I think the callback proposal is the simplest and most extensible option. This would also resolve #11014. And we could have flags in the callback to disable specific prints.

The stale bot removed the "won't fix" label on Sep 11, 2022
carmocca added this to the pl:future milestone on Sep 11, 2022
carmocca added the "good first issue" (Good for newcomers) and "callback" labels and removed the "discussion" label on Sep 11, 2022
@shenoynikhil
Contributor

I think I can take this up!

@CompRhys
Contributor Author

Did anything come from this? My initial PR was never reviewed -- #13379

@carmocca carmocca removed their assignment Jul 30, 2024
@zouharvi

zouharvi commented Nov 22, 2024

Seems like it made its way upstream? https://lightning.ai/docs/pytorch/stable/common/trainer.html#enable-model-summary

Apologies, confused it with enable_device_summary. Would make sense to be in the same place though.
