API change, expose model's state_dict to `accelerator.training_type_plugin` #7470

shuyingsunshine21 · 2021-05-10T18:03:05Z

🚀 Feature

Currently, in CheckpointConnector.dump_checkpoint, we have

model = self.trainer.lightning_module

checkpoint = {
    'epoch': current_epoch,
    'global_step': global_step,
    'pytorch-lightning_version': pytorch_lightning.__version__,
    'state_dict': model.state_dict(),
}

so the model's state dict is extracted here. However, letting accelerator.training_type_plugin control this logic might make more sense. With a sharded plugin in particular, we might need to access the local (i.e. sharded) state instead of the whole state.

Motivation

#6152 (comment)

We would like to customize the model state dict for a specific training type plugin. We could override the training_type_plugin.on_save method to modify the state dict, but this would cause a duplicate call to extract the model state dict.
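To make the duplicate call concrete, here is a minimal sketch in plain Python. It does not use pytorch_lightning: CountingModel, OnSavePlugin, and this simplified dump_checkpoint are hypothetical stand-ins that only mirror the flow described above, and the slicing is a toy placeholder for a real customization.

```python
class CountingModel:
    """Hypothetical stand-in for a LightningModule that counts state_dict() calls."""

    def __init__(self):
        self.calls = 0

    def state_dict(self):
        self.calls += 1
        return {"layer.weight": [1.0, 2.0, 3.0, 4.0]}


class OnSavePlugin:
    """Hypothetical plugin that customizes the state dict inside on_save."""

    def __init__(self, model):
        self.lightning_module = model

    def on_save(self, checkpoint):
        # Second extraction: the plugin rebuilds the state dict on its own.
        full = self.lightning_module.state_dict()
        # Toy customization: keep only the first two values of each entry.
        checkpoint["state_dict"] = {name: values[:2] for name, values in full.items()}
        return checkpoint


def dump_checkpoint(model, plugin):
    # First extraction: the connector pulls the state dict from the model directly.
    checkpoint = {"state_dict": model.state_dict()}
    return plugin.on_save(checkpoint)


model = CountingModel()
ckpt = dump_checkpoint(model, OnSavePlugin(model))
print(model.calls)
```

In this sketch the state dict is extracted twice for a single checkpoint, and the first result is thrown away.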

Pitch

Define a new method for TrainingTypePlugin:

def state_dict(self) -> dict:
    model = self.lightning_module
    return model.state_dict()

and in CheckpointConnector.dump_checkpoint,

checkpoint = {
    'epoch': current_epoch,
    'global_step': global_step,
    'pytorch-lightning_version': pytorch_lightning.__version__,
    'state_dict': self.trainer.accelerator.training_type_plugin.state_dict(),
}
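A minimal sketch of how this could play out for a sharded plugin, again in plain Python without pytorch_lightning. ToyModel, ToyShardedPlugin, the rank and world_size arguments, and the chunking logic are all assumptions made for illustration. A real sharded plugin would call whatever local-state API its wrapped model provides rather than slicing lists.

```python
class ToyModel:
    """Hypothetical stand-in for a LightningModule."""

    def state_dict(self):
        return {"layer.weight": [1.0, 2.0, 3.0, 4.0]}


class TrainingTypePlugin:
    """Simplified base plugin with the proposed state_dict hook."""

    def __init__(self, model):
        self.lightning_module = model

    def state_dict(self) -> dict:
        model = self.lightning_module
        return model.state_dict()


class ToyShardedPlugin(TrainingTypePlugin):
    """Hypothetical plugin that returns only this rank's share of each entry."""

    def __init__(self, model, rank, world_size):
        super().__init__(model)
        self.rank = rank
        self.world_size = world_size

    def state_dict(self) -> dict:
        full = self.lightning_module.state_dict()
        local = {}
        for name, values in full.items():
            # Toy sharding: contiguous, equally sized chunks per rank.
            size = len(values) // self.world_size
            local[name] = values[self.rank * size:(self.rank + 1) * size]
        return local


def dump_checkpoint(plugin, current_epoch, global_step):
    # The connector no longer touches the model; it asks the plugin instead.
    return {
        "epoch": current_epoch,
        "global_step": global_step,
        "state_dict": plugin.state_dict(),
    }


full_ckpt = dump_checkpoint(TrainingTypePlugin(ToyModel()), 0, 0)
shard_ckpt = dump_checkpoint(ToyShardedPlugin(ToyModel(), rank=1, world_size=2), 0, 0)
print(full_ckpt["state_dict"], shard_ckpt["state_dict"])
```

The default plugin keeps today's behavior, while a subclass can swap in its own state without dump_checkpoint changing.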

Alternatives

Additional context


shuyingsunshine21 added the feature (Is an improvement or enhancement) and help wanted (Open to be worked on) labels May 10, 2021

shuyingsunshine21 mentioned this issue May 10, 2021: Accelerator model state dict #7474 (Merged)

SeanNaren added the checkpointing (Related to checkpointing) label May 10, 2021

SeanNaren modified the milestones: v1.3.x, v1.4 May 10, 2021

SeanNaren closed this as completed in #7474 May 11, 2021
