[RFC] Better support using multiple loggers simultaneously by deprecating LoggerCollection
#11232
Comments
See this issue as an area where we need to access each individual logger's group separator: #11254. This is annoying to do with LoggerCollection today, since it would require iterating over the iterables available inside of the logger collection. It would also force my code to be aware of LoggerCollection vs using a single logger, which defeats the whole point of having the LoggerCollection interface. It'd be much easier to iterate over the available loggers like this:

    for logger in trainer.loggers:
        converted_metrics = _prefix_metrics(metrics_dict, prefix, logger.group_separator)
        logger.log_metrics(converted_metrics, trainer.global_step)

This code will work for a single logger or multiple. |
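To make the loop above concrete, here is a minimal self-contained sketch. `StubLogger`, `prefix_metrics`, and `log_to_all` are hypothetical names written for this illustration; the real `_prefix_metrics` helper is not shown in the thread, so its behavior here (joining prefix and metric name with the logger's separator) is an assumption.

```python
from typing import Dict, List, Tuple


class StubLogger:
    """Hypothetical stand-in for a logger; it only has what the loop needs."""

    def __init__(self, group_separator: str) -> None:
        self.group_separator = group_separator
        self.logged: List[Tuple[Dict[str, float], int]] = []

    def log_metrics(self, metrics: Dict[str, float], step: int) -> None:
        # Record the call so the result can be inspected afterwards.
        self.logged.append((metrics, step))


def prefix_metrics(metrics: Dict[str, float], prefix: str, separator: str) -> Dict[str, float]:
    """Assumed behavior of the `_prefix_metrics` helper mentioned in the comment."""
    if not prefix:
        return dict(metrics)
    return {f"{prefix}{separator}{name}": value for name, value in metrics.items()}


def log_to_all(loggers: List[StubLogger], metrics: Dict[str, float], prefix: str, step: int) -> None:
    # The same loop handles zero, one, or many loggers.
    for logger in loggers:
        converted = prefix_metrics(metrics, prefix, logger.group_separator)
        logger.log_metrics(converted, step)


slash_logger = StubLogger("/")
dot_logger = StubLogger(".")
log_to_all([slash_logger, dot_logger], {"loss": 0.5}, "train", step=10)
```

Each logger receives keys built with its own separator, which is the part that is hard to express when the loggers are hidden behind a single collection object.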
Hey @ananthsub, Sounds like a good suggestion. This could lead to some breaking changes. |
@ananthsub this is better than |
Currently, the Trainer accepts an iterable of loggers, so passing a
Where would the user be accessing the logger from? I proposed using |
Do you plan to add a deprecation path for the setter? Also, do you plan to rename |
@carmocca good questions.
This update should also be mirrored in EDIT: I don't see how we can manage this if someone does this:
what's the expectation for
One option is to go all in with
From an API standpoint, this is the cleanest and most consistent, but it also introduces more breaking changes to the Trainer and LightningModule, which I understand users are sensitive to. Over time it reduces the risk of bugs though, such as the attributes getting out of sync. |
Do we have an estimate of what percentage of users use multiple loggers vs one logger? |
I vote for going all in with For the cases where users set If someone sets |
I think the setters for |
This is not allowed. We need to move away from this in the future anyway. The logger has to be owned by the logger connector and not be set on the Trainer indirectly. It is what we wanted to do for a while but never got to it. |
The logger vs. loggers proposal here has some open questions which I believe we should address first before we set the api changes in motion. If we previously had
and in my LightningModule:
Then now with the new change, if I change to Furthermore, there is the question of whether to go "all in" with the loggers property. But what that means for the user is that they would have to write
99% use one logger. 1% uses 2 loggers. nobody uses more than 2. Motivation by @ananthsub:
To emphasize, the last part here "users have to [...] update their code to handle cases where a single logger is used vs multiple." is neither handled well by the current LoggerCollection nor does it get solved by the proposals here. Shouldn't we investigate a design that lets us switch out the loggers without being forced to change the LM code? |
@awaelchli - one solution for that is to pick an arbitrary What we could do is redefine

    if len(loggers) == 0:
        return None
    if len(loggers) == 1:
        return loggers[0]
    else:
        # emit warning of using logger when multiple loggers are configured, with this behavior changing in v1.8 to return the first logger in the list
        return LoggerCollection(loggers)

and afterward we do this:

    if len(loggers) == 0:
        return None
    if len(loggers) > 1:
        # emit warning of using logger when multiple loggers are configured
    return loggers[0]

In this case, we don't have to deprecate the single |
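The two stages proposed above can be sketched as plain functions. This is only an illustration under stated assumptions: `StubLogger`, `StubLoggerCollection`, and both function names are hypothetical, the warning categories and messages are made up for the example, and in the actual proposal this logic would live in the Trainer's `logger` property.

```python
import warnings
from typing import List, Optional, Union


class StubLogger:
    """Hypothetical minimal logger standing in for any real logger class."""

    def __init__(self, name: str) -> None:
        self.name = name


class StubLoggerCollection:
    """Hypothetical stand-in for LoggerCollection: it just wraps several loggers."""

    def __init__(self, loggers: List[StubLogger]) -> None:
        self.loggers = list(loggers)


def logger_during_deprecation(
    loggers: List[StubLogger],
) -> Optional[Union[StubLogger, StubLoggerCollection]]:
    """Stage 1: keep returning a collection for multiple loggers, but warn."""
    if len(loggers) == 0:
        return None
    if len(loggers) == 1:
        return loggers[0]
    warnings.warn(
        "`logger` was accessed while multiple loggers are configured; "
        "this will return the first logger in a future release.",
        DeprecationWarning,
        stacklevel=2,
    )
    return StubLoggerCollection(loggers)


def logger_after_removal(loggers: List[StubLogger]) -> Optional[StubLogger]:
    """Stage 2: always return the first logger, warning if others exist."""
    if len(loggers) == 0:
        return None
    if len(loggers) > 1:
        warnings.warn(
            "`logger` returns only the first of several configured loggers; "
            "use `loggers` to reach all of them.",
            UserWarning,
            stacklevel=2,
        )
    return loggers[0]
```

With one logger, both stages behave identically and never warn, which matches the claim that the common single-logger case needs no code changes.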
@ananthsub I was in the middle of suggesting a similar solution right as you posted it. I agree with @awaelchli that deprecating I think maintaining both |
To summarize:
Regarding the trainer API, I also don't see a strong need to do this migration: |
That sounds good. @awaelchli please confirm if this makes sense to you, so that @akashkw can finish his PR |
I am ok with that, we just need to be fully aware that if we remove the LoggerCollection abstraction, we won't be able to improve the situation where the user needs to change their code to make the LM compatible with the trainer settings, at least not with the current logger reference, unless we recommend always adding that for loop. In summary, the implications are:
Before:

    # agnostic to number of loggers in trainer (except 0)
    self.logger.log_image(...)

After:
This becomes the standard way of accessing custom logging methods: the for loop will always be required to make the code compatible with 0, 1, and multiple loggers. The only way I see to avoid the "boilerplate" for loop without a logger collection abstraction is to add the methods log_text, log_image, etc. to the LM itself, which internally would loop over the loggers. |
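The last idea, putting a forwarding method on the module itself, could look roughly like the sketch below. Everything here is hypothetical: `StubModule`, `ImageLogger`, and `MetricsOnlyLogger` are illustration-only classes, and silently skipping loggers that lack `log_image` is an assumption made for the example, not something the thread decided.

```python
from typing import Any, List, Tuple


class ImageLogger:
    """Hypothetical logger that supports image logging."""

    def __init__(self) -> None:
        self.images: List[Tuple[str, Any]] = []

    def log_image(self, key: str, images: Any) -> None:
        self.images.append((key, images))


class MetricsOnlyLogger:
    """Hypothetical logger with no `log_image` method at all."""


class StubModule:
    """Stand-in for a LightningModule that hides the loop over loggers."""

    def __init__(self, loggers: List[Any]) -> None:
        self.loggers = list(loggers)

    def log_image(self, key: str, images: Any) -> int:
        """Forward to every logger that has `log_image`; return how many did."""
        handled = 0
        for logger in self.loggers:
            method = getattr(logger, "log_image", None)
            if callable(method):
                method(key, images)
                handled += 1
        return handled
```

User code then calls `self.log_image(...)` once and stays valid for 0, 1, or several loggers, at the cost of growing the module's API surface.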
I'm not a fan of this for a few reasons:
|
I've outlined some of the use cases before and after we deprecate Methods that:
Before deprecating
# Most common use case, which is very straightforward
self.logger.log_metrics(...)
self.logger.log_image(...)
self.logger.version(...)
self.logger.log_metrics(...)
# Can't control which loggers have log_image called
# Other than NeptuneLogger and WandbLogger, all loggers raise NotImplementedError
self.logger.log_image(...)
# Concatenates the versions of each logger, which probably isn't that useful
self.logger.version(...)
After deprecating
# Most common use case, which is still very straightforward
self.logger.log_metrics(...)
self.logger.log_image(...)
self.logger.version(...)
for logger in self.loggers:
    logger.log_metrics(...)
# Can control which loggers have log_image called
for logger in self.loggers:
    if isinstance(logger, NeptuneLogger):
        logger.log_image(...)
# User can do whatever they want with logger.version
for logger in self.loggers:
    if logger.version != required_version:
        raise Exception()
|
Proposed refactor
Deprecate LoggerCollection: https://github.com/PyTorchLightning/pytorch-lightning/blob/a6a28e08d22f59f2468ff1049c67507df7519e7c/pytorch_lightning/loggers/base.py#L370-L464
Motivation
The logger collection implements the base logger interface and wraps a sequence of loggers. This way, the trainer always behaves as though a single logger is being used. Such a collection is somewhat handy for writing data, but is much less helpful when reading state from loggers, especially when individual loggers have different state/property values. For instance, it's not clear at all what LoggerCollection.name should be. The concatenation of all its instance loggers' names? The first logger's name? This also came up here: #10954
This leads to inconsistencies across components. Namely, Lightning offers no such LoggerCollection equivalent for Callbacks. The trainer treats them as a flat list, and when hooks need to be called, the Trainer iterates over this list and calls the callbacks' hooks successively. Why do we not do the same for loggers?
This leads to leaks in the user-facing API. If users pass a list of loggers to the Trainer's logger flag, accessing trainer.logger returns a single instance of type LoggerCollection, which is different from what users specified. This means users have to learn about this abstraction and likely update their code to handle cases where a single logger is used vs multiple.
Pitch
Add a loggers property on the Trainer which returns a list of the passed-in Logger objects, so users have an option to not use the LoggerCollection.
Deprecate LoggerCollection in favor of this loggers property. Wherever we call the Logger APIs from within the Trainer, iterate over the loggers list and call the hooks instead.
Example:
Current
https://github.com/PyTorchLightning/pytorch-lightning/blob/a6a28e08d22f59f2468ff1049c67507df7519e7c/pytorch_lightning/trainer/trainer.py#L1444-L1445
New
Once LoggerCollection is removed, we could redefine logger on the Trainer to be something like this. Or we could deprecate logger entirely in favor of loggers to avoid any potential misuse/missed expectations around supporting multiple loggers.
Originally posted by @ananthsub in #11209 (comment)
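As a rough illustration of the pitch, the sketch below shows a trainer that stores loggers as a flat list and loops over them when a hook fires, mirroring how callbacks are described above. `StubTrainer`, `StubLogger`, and `finalize_loggers` are hypothetical names for this example, and the normalization of the `logger` argument is an assumption rather than the Trainer's actual code.

```python
from collections.abc import Iterable
from typing import Any, List, Optional


class StubLogger:
    """Hypothetical logger exposing a single hook for the trainer to call."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.finalized_with: Optional[str] = None

    def finalize(self, status: str) -> None:
        self.finalized_with = status


class StubTrainer:
    """Stand-in trainer that keeps loggers in a flat list, like callbacks."""

    def __init__(self, logger: Any = None) -> None:
        # Normalize None, a single logger, or an iterable of loggers to a list.
        if logger is None:
            self._loggers: List[StubLogger] = []
        elif isinstance(logger, Iterable):
            self._loggers = list(logger)
        else:
            self._loggers = [logger]

    @property
    def loggers(self) -> List[StubLogger]:
        return self._loggers

    def finalize_loggers(self, status: str) -> None:
        # No collection wrapper: iterate and call the hook on each logger.
        for logger in self.loggers:
            logger.finalize(status)
```

Because `loggers` is always a list, calling code never has to branch on whether one or several loggers were passed.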
cc @justusschock @awaelchli @akihironitta @edward-io @Borda @ananthsub @tchaton