
[RFC] Introduce Plugin Registry #6923


Closed

SeanNaren opened this issue Apr 9, 2021 · 8 comments

@SeanNaren
Contributor

SeanNaren commented Apr 9, 2021

🚀 Feature

Introduce a registry, where plugins and users can register their own plugins to alias when creating the trainer (cc @tchaton who came up with the idea).

from pytorch_lightning.plugins import plugins_registry

print(plugins_registry.available_plugins(with_descriptions=True, with_recommendations=True))

# prints
# ddp_sharded: description ..., recommended when: ...
# deepspeed: description ..., recommended when: ...
# deepspeed_zero_3: description ..., recommended when: ...
# deepspeed_zero_3_offload: description ..., recommended when: ...
# ...

pl.Trainer(plugins='deepspeed_zero_3_offload')

Motivation

When using https://github.com/SeanNaren/minGPT I was testing a bunch of different configurations across DeepSpeed and Sharded training. I was doing it manually, by specifying the plugin and the options available:

    trainer = Trainer.from_argparse_args(
        args,
        log_every_n_steps=1,
        max_epochs=1,
        # plugins=DDPPlugin(find_unused_parameters=True), # keep these for later to run
        # plugins=DDPShardedPlugin(ddp_comm_hook=...), # keep these for later to run
        plugins=DeepSpeedPlugin(stage=2, cpu_offload=False, logging_level=logging.INFO),
        gradient_clip_val=1.0,
        checkpoint_callback=False,
        callbacks=[lr_decay, CUDACallback()],
    )

This is extremely cumbersome, and we're seeing a similar trend appear for other libraries as well, case by case. Having aliases allows users to use the trainer without having to modify the code. This is also important for cloud training-agent solutions that rely on this flexibility from repos.

To support this in code right now, I have to either A) integrate a config tool (Hydra) to instantiate the plugins, or B) add a bunch of branching statements and argparse.

Given that the goal of Lightning is to reduce boilerplate, my proposal moves this boilerplate out of user code and into Lightning via a registry.

Alternatives

Force users to use an instantiating package such as Hydra in their own repo or implement their own branching logic to select a plugin.

Points from the discussion with @carmocca. I don't have answers to all of these, and most are not important now, but they will be once this registry idea has been approved at the high-level API.

How many plugins would we provide?

Leave it up to the plugin to decide. It would be nice to have an interface on the plugin (maybe an inherited classmethod) that defines the aliases, and the argument combinations behind them, to add to the registry; see the sketch below.
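
A minimal sketch of that inherited classmethod; PluginRegistry, register(), and register_plugins() are assumed names for the proposed API, not existing Lightning code:

from typing import Any, Callable, Dict

class PluginRegistry:
    def __init__(self) -> None:
        self._plugins: Dict[str, Callable[[], Any]] = {}

    def register(self, alias: str, plugin_cls: type, **kwargs: Any) -> None:
        # Store a factory so the plugin is only built when the alias is used.
        self._plugins[alias] = lambda: plugin_cls(**kwargs)

    def get(self, alias: str) -> Any:
        return self._plugins[alias]()

    def available_plugins(self) -> list:
        return sorted(self._plugins)

class DeepSpeedPlugin:  # stand-in for the real plugin class
    def __init__(self, stage: int = 2, cpu_offload: bool = False) -> None:
        self.stage, self.cpu_offload = stage, cpu_offload

    @classmethod
    def register_plugins(cls, registry: PluginRegistry) -> None:
        # Each alias pins this class to one supported argument combination.
        registry.register("deepspeed", cls)
        registry.register("deepspeed_stage_3", cls, stage=3)
        registry.register("deepspeed_stage_3_offload", cls, stage=3, cpu_offload=True)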

Where would the registry be?

I think this question boils down to whether we want a general registry across PL or separate registries. The registry should live in PL and, for a first iteration, should preferably be tied to plugins. If the idea proves its value, I can see further iterations moving it to other arguments.

Should we do this for other trainer arguments?

Based on the above, I'd start with plugins first before moving on to all other arguments, primarily to ensure the idea proves its value. I can imagine a world where the registry provides easier access to certain objects that can be instantiated easily.

How would users register their own plugins?

Using the same registry, preferably with a function or context manager; a sketch follows below.
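
A hedged possibility for the context-manager form, building on the PluginRegistry sketch above (temporary_plugin is an assumed name):

from contextlib import contextmanager

@contextmanager
def temporary_plugin(registry: PluginRegistry, alias: str, plugin_cls: type, **kwargs):
    # Register the alias only for the duration of the with-block.
    registry.register(alias, plugin_cls, **kwargs)
    try:
        yield registry
    finally:
        registry._plugins.pop(alias, None)

# with temporary_plugin(plugins_registry, "my_deepspeed", DeepSpeedPlugin, stage=3):
#     trainer = pl.Trainer(plugins="my_deepspeed")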

Where would we register our plugins?

As is done in many other repos, we can add our pre-built plugins via a simple search across the plugins directory, sketched below.
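
One hedged way that search could work, assuming plugins expose the register_plugins classmethod sketched above:

import importlib
import inspect
import pkgutil

import pytorch_lightning.plugins as plugins_pkg

def register_builtin_plugins(registry: PluginRegistry) -> None:
    # Walk every module under pytorch_lightning.plugins and let each class
    # that defines register_plugins add its own aliases to the registry.
    for mod_info in pkgutil.iter_modules(plugins_pkg.__path__):
        module = importlib.import_module(f"{plugins_pkg.__name__}.{mod_info.name}")
        for _, cls in inspect.getmembers(module, inspect.isclass):
            if hasattr(cls, "register_plugins"):
                cls.register_plugins(registry)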

Will this introduce too much clutter?

To me this is a non-issue, as I prioritise reducing user-code clutter over Lightning-code clutter, and if aliases are handled by the plugins themselves, the responsibility is distributed. Currently, using custom plugins with their various configurations is impractical. If we'd like to continue reducing boilerplate, we'll need some form of registry to make it easier to enable plugins with slightly different arguments.

@SeanNaren SeanNaren added feature Is an improvement or enhancement help wanted Open to be worked on discussion In a discussion stage labels Apr 9, 2021
@SeanNaren SeanNaren self-assigned this Apr 9, 2021
@ethanwharris
Member

This seems very nice 😃 A couple of comments/questions:

  • So, the user can add a plugin to the registry plus a bunch of other plugin-specific configs. Is the use case that you maybe have a bunch of different machines where you like to use different training types (but also precisions, I guess) with their own settings?
  • What is the proposed API for adding to it? It would be neat if, in addition to adding plugins one at a time, you could have an outer-level registry (e.g. in another repo or Python file) and do e.g.:
from ... import plugins_registry as my_registry
plugins_registry.add(my_registry)
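
One possible shape for that add() call, assuming a registry that keeps an alias → factory mapping internally (the clash-handling policy here is an assumption):

from typing import Any, Callable, Dict

class PluginRegistry:
    """Sketch only: alias -> factory mapping, not Lightning's actual class."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Callable[[], Any]] = {}

    def add(self, other: "PluginRegistry") -> None:
        # Merge every alias from the external registry, refusing clashes.
        for alias, factory in other._plugins.items():
            if alias in self._plugins:
                raise ValueError(f"Plugin alias '{alias}' is already registered")
            self._plugins[alias] = factory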

@ananthsub
Contributor

ananthsub commented Apr 9, 2021

@ethanwharris agreed, this could work like fsspec's registration mechanism, which is how Lightning supports different cloud storage options
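
For reference, fsspec's mechanism looks like this (register_implementation is fsspec's real API; MyFileSystem is just a placeholder):

import fsspec
from fsspec.spec import AbstractFileSystem

class MyFileSystem(AbstractFileSystem):
    protocol = "myfs"

# Any "myfs://..." URL now resolves to MyFileSystem.
fsspec.register_implementation("myfs", MyFileSystem)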

FWIW I think this could extend to any argument in the trainer that takes either a str or an instance of a class. This could equally apply to profilers or accelerators as well

@SeanNaren
Contributor Author

I've updated the OP with more information, please take a look!

Outer registry is a cool idea and, I think, the natural extension if the plugins registry works well and proves its value. I've primarily selected plugins because, as a user, this is what I need the most. However, if we can build a general enough API, it would be sweet to add this; I'd just try to minimize the code changes required throughout Lightning if possible.

cc @carmocca I've added discussion points above.

@kaushikb11 kaushikb11 self-assigned this Apr 12, 2021
@carmocca
Contributor

Thank you so much @SeanNaren for addressing all the points in the top post.

How does this all fit with #6090? Would we just rename plugins to training_type?

What about argument order? Should both deepspeed_zero_3_offload and deepspeed_offload_zero_3 work?

Before moving forward with the implementation, I'd like us to define exactly which options would be returned by (using your top-post snippet):

plugins_registry.available_plugins()

Also, is there value in with_descriptions=True, with_recommendations=True? Wouldn't that just be duplicating docs?

@SeanNaren
Contributor Author

Thanks @carmocca!

How does this all fit with #6090? Would we just rename plugins to training_type?

Great question, I think that would be preferred. This would then become a TrainingTypeRegistry, but for the most part it would be an internal refactor, plus ensuring references to the original name exist for deprecation.

What about argument order? Should both deepspeed_zero_3_offload and deepspeed_offload_zero_3 work?

It should be up to the plugin to define the finite list of plugin configurations it supports. These should be documented as such within the plugin's own documentation. In this specific example, it would be defined as deepspeed_stage_3_offload, as this is the correct naming based on the upstream library.

Before moving forward with the implementation, I'd like that we exactly define all options that would be returned by (using your top post snippet):

Given that the current plugin list is quite small, I can outline the plugins the registry would support out of the box with PL (including FSDP). Note that this doesn't include AMP, Apex, or the precision plugins, and doesn't reflect how it would look in a few months if other plugins are added. Also note that we include the pre-defined flag combinations: these can now be defined in the registry, which cleans up having to define every plugin alias in a single Enum (as currently done in PL).

plugins_registry.available_plugins()

# ddp_sharded
# ddp_fully_sharded
# ddp_fully_sharded_auto_wrap
# deepspeed
# deepspeed_stage_2_offload
# deepspeed_stage_3
# deepspeed_stage_3_offload

# Vanilla plugins
# ddp
...

Also, is there value in with_descriptions=True, with_recommendations=True? Wouldn't that just be duplicating docs?

I agree. Including descriptions/recommendations isn't a fully matured idea, and for a first-pass iteration it would be unnecessary IMO.

@ananthsub
Contributor

@SeanNaren @kaushikb11 should we also add an entry for DDP to deal with find_unused_parameters? Then we could make it even easier for people to tap into these perf savings: https://pytorch-lightning.readthedocs.io/en/latest/benchmarking/performance.html#when-using-ddp-set-find-unused-parameters-false
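
For example, a hedged registry entry along the lines of the sketches above (register() and the alias name are assumptions, not a shipped API):

from pytorch_lightning.plugins import DDPPlugin

plugins_registry.register(
    "ddp_find_unused_parameters_false",
    DDPPlugin,
    find_unused_parameters=False,
)

# Users could then opt into the perf saving by alias alone:
# pl.Trainer(plugins="ddp_find_unused_parameters_false")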

@kaushikb11
Contributor

@ananthsub Yup, definitely! That makes sense. It would be convenient for users.

@SeanNaren
Contributor Author

This was done in #6982 :)
