
Commit ecfa60b (merge commit with parents bd965d9 and c3614f1)


106 files changed: 1,602 additions and 789 deletions (only a subset of the changed files is shown below).


.deepsource.toml

Lines changed: 0 additions & 26 deletions
This file was deleted.

CHANGELOG.md

Lines changed: 30 additions & 2 deletions
@@ -64,6 +64,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
   * Allow registering custom optimizers and learning rate schedulers without subclassing the CLI ([#9565](https://github.com/PyTorchLightning/pytorch-lightning/pull/9565))
   * Support shorthand notation to instantiate optimizers and learning rate schedulers ([#9565](https://github.com/PyTorchLightning/pytorch-lightning/pull/9565))
   * Support passing lists of callbacks via command line ([#8815](https://github.com/PyTorchLightning/pytorch-lightning/pull/8815))
+  * Support shorthand notation to instantiate models ([#9588](https://github.com/PyTorchLightning/pytorch-lightning/pull/9588))
+  * Support shorthand notation to instantiate datamodules ([#10011](https://github.com/PyTorchLightning/pytorch-lightning/pull/10011))
 
 
 - Fault-tolerant training:

@@ -193,24 +195,35 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Added `strategy` argument to Trainer ([#8597](https://github.com/PyTorchLightning/pytorch-lightning/pull/8597))
 
 
+- Added `init_meta_context`, `materialize_module` utilities ([#9920](https://github.com/PyTorchLightning/pytorch-lightning/pull/9920))
+
+
 - Added `TPUPrecisionPlugin` ([#10020](https://github.com/PyTorchLightning/pytorch-lightning/pull/#10020))
 
 
 - `torch.bfloat16` support:
   * Added bfloat16 support for Lightning Trainer ([#9049](https://github.com/PyTorchLightning/pytorch-lightning/pull/9049))
   * Renamed `TPUHalfPrecisionPlugin` to `TPUBf16PrecisionPlugin` ([#10026](https://github.com/PyTorchLightning/pytorch-lightning/pull/10026))
-
+  * Default to `precision=bf16` on CPU when `precision=16` is passed ([#10033](https://github.com/PyTorchLightning/pytorch-lightning/pull/10033))
 
 
 - Added `kfold` example for loop customization ([#9965](https://github.com/PyTorchLightning/pytorch-lightning/pull/9965))
 
 
 - LightningLite:
   * Added `PrecisionPlugin.forward_context`, making it the default implementation for all `{train,val,test,predict}_step_context()` methods ([#9988](https://github.com/PyTorchLightning/pytorch-lightning/pull/9988))
-  * Added `DDPSpawnPlugin.spawn()` for spawning new processes of a given function ([#10018](https://github.com/PyTorchLightning/pytorch-lightning/pull/10018))
+  * Added `DDPSpawnPlugin.spawn()` for spawning new processes of a given function ([#10018](https://github.com/PyTorchLightning/pytorch-lightning/pull/10018), [#10022](https://github.com/PyTorchLightning/pytorch-lightning/pull/10022))
   * Added `TrainingTypePlugin.{_setup_model, _setup_optimizer}` methods ([#9994](https://github.com/PyTorchLightning/pytorch-lightning/pull/9994))
   * Implemented `DataParallelPlugin._setup_model` ([#10010](https://github.com/PyTorchLightning/pytorch-lightning/pull/10010))
   * Implemented `DeepSpeedPlugin._setup_models_and_optimizers` ([#10009](https://github.com/PyTorchLightning/pytorch-lightning/pull/10009))
+  * Implemented `{DDPShardedPlugin,DDPShardedSpawnPlugin}._setup_models_and_optimizers` ([#10028](https://github.com/PyTorchLightning/pytorch-lightning/pull/10028))
+  * Added optional `model` argument to the `optimizer_step` methods in accelerators and plugins ([#10023](https://github.com/PyTorchLightning/pytorch-lightning/pull/10023))
+
+
+- Added `XLACheckpointIO` plugin ([#9972](https://github.com/PyTorchLightning/pytorch-lightning/pull/9972))
+
+
 
 ### Changed
 

@@ -508,6 +521,12 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Remove deprecated `distributed_backend` from `Trainer` ([#10017](https://github.com/PyTorchLightning/pytorch-lightning/pull/10017))
 
 
+- Removed `process_idx` from the `{DDPSpawnPlugin,TPUSpawnPlugin}.new_process` methods ([#10022](https://github.com/PyTorchLightning/pytorch-lightning/pull/10022))
+
+
+- Removed automatic patching of `{train,val,test,predict}_dataloader()` on the `LightningModule` ([#9764](https://github.com/PyTorchLightning/pytorch-lightning/pull/9764))
+
+
 ### Fixed
 
 

@@ -553,6 +572,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Fixed `broadcast` in `DDPPlugin` and ``DDPSpawnPlugin` to respect the `src` input ([#9691](https://github.com/PyTorchLightning/pytorch-lightning/pull/9691))
 
 
+- Fixed `self.log(on_epoch=True, reduce_fx=sum))` for the `on_batch_start` and `on_train_batch_start` hooks ([#9791(https://github.com/PyTorchLightning/pytorch-lightning/pull/9791))
+
+
 - Fixed `self.log(on_epoch=True)` for the `on_batch_start` and `on_train_batch_start` hooks ([#9780](https://github.com/PyTorchLightning/pytorch-lightning/pull/9780))
 
 

@@ -585,6 +607,12 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Fixed `train_dataloader` getting loaded twice when resuming from a checkpoint during `Trainer.fit()` ([#9671](https://github.com/PyTorchLightning/pytorch-lightning/pull/9671))
 
 
+- Fixed `LearningRateMonitor` logging with multiple param groups optimizer with no scheduler ([#10044](https://github.com/PyTorchLightning/pytorch-lightning/pull/10044))
+
+
+- Fixed undesired side effects being caused by `Trainer` patching dataloader methods on the `LightningModule` ([#9764](https://github.com/PyTorchLightning/pytorch-lightning/pull/9764))
+
+
 ## [1.4.9] - 2021-09-30
 
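The shorthand notation mentioned in the `LightningCLI` entries lets registered classes be selected by name from the command line instead of spelling out full `class_path`/`init_args` structures. A rough sketch of the idea, assuming a hypothetical CLI script with placeholder `DemoModel` and `DemoDataModule` classes; the exact flag spelling is described in the linked PRs, not in this commit.

.. code-block:: python

    # Hypothetical sketch. `DemoModel` and `DemoDataModule` are placeholder user classes;
    # the command-line flags in the comment below are illustrative only.
    from pytorch_lightning.utilities.cli import LightningCLI

    from my_project import DemoModel, DemoDataModule  # assumed user-defined classes

    cli = LightningCLI(DemoModel, DemoDataModule)

    # With shorthand notation, registered classes can then be picked by name, e.g. something like:
    #   python script.py fit --optimizer=Adam --optimizer.lr=0.01 --lr_scheduler=CosineAnnealingLR
    # rather than providing a full class_path/init_args config file.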

docs/source/advanced/advanced_gpu.rst

Lines changed: 23 additions & 23 deletions
@@ -71,9 +71,9 @@ To use Sharded Training, you need to first install FairScale using the command b
 .. code-block:: python
 
     # train using Sharded DDP
-    trainer = Trainer(plugins="ddp_sharded")
+    trainer = Trainer(strategy="ddp_sharded")
 
-Sharded Training can work across all DDP variants by adding the additional ``--plugins ddp_sharded`` flag.
+Sharded Training can work across all DDP variants by adding the additional ``--strategy ddp_sharded`` flag.
 
 Internally we re-initialize your optimizers and shard them across your machines and processes. We handle all communication using PyTorch distributed, so no code changes are required.
 

@@ -156,7 +156,7 @@ Below is an example of using both ``wrap`` and ``auto_wrap`` to create your mode
 
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="fsdp", precision=16)
+    trainer = Trainer(gpus=4, strategy="fsdp", precision=16)
     trainer.fit(model)
 
     trainer.test()

@@ -248,7 +248,7 @@ It is recommended to skip Stage 1 and use Stage 2, which comes with larger memor
     from pytorch_lightning import Trainer
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_1", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_1", precision=16)
     trainer.fit(model)
 

@@ -265,7 +265,7 @@ As a result, benefits can also be seen on a single GPU. Do note that the default
     from pytorch_lightning import Trainer
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_2", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_2", precision=16)
     trainer.fit(model)
 
 .. code-block:: bash

@@ -286,7 +286,7 @@ Below we show an example of running `ZeRO-Offload <https://www.deepspeed.ai/tuto
     from pytorch_lightning.plugins import DeepSpeedPlugin
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_2_offload", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_2_offload", precision=16)
     trainer.fit(model)
 

@@ -307,7 +307,7 @@ You can also modify the ZeRO-Offload parameters via the plugin as below.
     model = MyModel()
     trainer = Trainer(
         gpus=4,
-        plugins=DeepSpeedPlugin(offload_optimizer=True, allgather_bucket_size=5e8, reduce_bucket_size=5e8),
+        strategy=DeepSpeedPlugin(offload_optimizer=True, allgather_bucket_size=5e8, reduce_bucket_size=5e8),
         precision=16,
     )
     trainer.fit(model)

@@ -340,7 +340,7 @@ For even more speed benefit, DeepSpeed offers an optimized CPU version of ADAM c
 
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_2_offload", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_2_offload", precision=16)
     trainer.fit(model)
 

@@ -383,7 +383,7 @@ Also please have a look at our :ref:`deepspeed-zero-stage-3-tips` which contains
 
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_3", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_3", precision=16)
     trainer.fit(model)
 
     trainer.test()

@@ -403,7 +403,7 @@ You can also use the Lightning Trainer to run predict or evaluate with DeepSpeed
 
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_3", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_3", precision=16)
     trainer.test(ckpt_path="my_saved_deepspeed_checkpoint.ckpt")
 

@@ -438,7 +438,7 @@ This reduces the time taken to initialize very large models, as well as ensure w
 
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_3", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_3", precision=16)
     trainer.fit(model)
 
     trainer.test()

@@ -463,14 +463,14 @@ DeepSpeed ZeRO Stage 3 Offloads optimizer state, gradients to the host CPU to re
 
     # Enable CPU Offloading
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_3_offload", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_3_offload", precision=16)
     trainer.fit(model)
 
     # Enable CPU Offloading, and offload parameters to CPU
     model = MyModel()
     trainer = Trainer(
         gpus=4,
-        plugins=DeepSpeedPlugin(
+        strategy=DeepSpeedPlugin(
             stage=3,
             offload_optimizer=True,
             offload_parameters=True,

@@ -492,14 +492,14 @@ Additionally, DeepSpeed supports offloading to NVMe drives for even larger model
 
     # Enable CPU Offloading
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_3_offload", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_3_offload", precision=16)
     trainer.fit(model)
 
     # Enable CPU Offloading, and offload parameters to CPU
     model = MyModel()
     trainer = Trainer(
         gpus=4,
-        plugins=DeepSpeedPlugin(
+        strategy=DeepSpeedPlugin(
             stage=3,
             offload_optimizer=True,
             offload_parameters=True,

@@ -576,12 +576,12 @@ This saves memory when training larger models, however requires using a checkpoi
     model = MyModel()
 
 
-    trainer = Trainer(gpus=4, plugins="deepspeed_stage_3_offload", precision=16)
+    trainer = Trainer(gpus=4, strategy="deepspeed_stage_3_offload", precision=16)
 
     # Enable CPU Activation Checkpointing
     trainer = Trainer(
         gpus=4,
-        plugins=DeepSpeedPlugin(
+        strategy=DeepSpeedPlugin(
             stage=3,
             offload_optimizer=True,  # Enable CPU Offloading
             cpu_checkpointing=True,  # (Optional) offload activations to CPU

@@ -670,7 +670,7 @@ In some cases you may want to define your own DeepSpeed Config, to access all pa
     }
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins=DeepSpeedPlugin(deepspeed_config), precision=16)
+    trainer = Trainer(gpus=4, strategy=DeepSpeedPlugin(deepspeed_config), precision=16)
     trainer.fit(model)
 

@@ -682,7 +682,7 @@ We support taking the config as a json formatted file:
     from pytorch_lightning.plugins import DeepSpeedPlugin
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins=DeepSpeedPlugin("/path/to/deepspeed_config.json"), precision=16)
+    trainer = Trainer(gpus=4, strategy=DeepSpeedPlugin("/path/to/deepspeed_config.json"), precision=16)
     trainer.fit(model)
 

@@ -717,7 +717,7 @@ This can reduce peak memory usage and throughput as saved memory will be equal t
     from pytorch_lightning.plugins import DDPPlugin
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins=DDPPlugin(gradient_as_bucket_view=True))
+    trainer = Trainer(gpus=4, strategy=DDPPlugin(gradient_as_bucket_view=True))
     trainer.fit(model)
 
 DDP Communication Hooks

@@ -740,7 +740,7 @@ Enable `FP16 Compress Hook for multi-node throughput improvement <https://pytorc
     )
 
     model = MyModel()
-    trainer = Trainer(gpus=4, plugins=DDPPlugin(ddp_comm_hook=default.fp16_compress_hook))
+    trainer = Trainer(gpus=4, strategy=DDPPlugin(ddp_comm_hook=default.fp16_compress_hook))
     trainer.fit(model)
 
 Enable `PowerSGD for multi-node throughput improvement <https://pytorch.org/docs/stable/ddp_comm_hooks.html#powersgd-communication-hook>`__:

@@ -758,7 +758,7 @@ Enable `PowerSGD for multi-node throughput improvement <https://pytorch.org/docs
     model = MyModel()
     trainer = Trainer(
         gpus=4,
-        plugins=DDPPlugin(
+        strategy=DDPPlugin(
            ddp_comm_state=powerSGD.PowerSGDState(
                process_group=None,
                matrix_approximation_rank=1,

@@ -787,7 +787,7 @@ Combine hooks for accumulated benefit:
     model = MyModel()
     trainer = Trainer(
         gpus=4,
-        plugins=DDPPlugin(
+        strategy=DDPPlugin(
            ddp_comm_state=powerSGD.PowerSGDState(
                process_group=None,
                matrix_approximation_rank=1,
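All of the documentation edits above follow one pattern: the deprecated `plugins=` argument becomes the new `strategy=` argument added to the Trainer in this release (see the CHANGELOG entry for #8597). A minimal migration sketch, reusing the `DeepSpeedPlugin` arguments shown in the diff; `MyModel` is the placeholder LightningModule used throughout these docs, not a class defined in this commit.

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins import DeepSpeedPlugin

    model = MyModel()  # placeholder LightningModule, as in the docs above

    # Before (deprecated): Trainer(gpus=4, plugins="deepspeed_stage_2_offload", precision=16)
    # After:
    trainer = Trainer(gpus=4, strategy="deepspeed_stage_2_offload", precision=16)

    # The same rename applies when passing a configured plugin object:
    trainer = Trainer(
        gpus=4,
        strategy=DeepSpeedPlugin(offload_optimizer=True, allgather_bucket_size=5e8, reduce_bucket_size=5e8),
        precision=16,
    )
    trainer.fit(model)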

docs/source/advanced/ipu.rst

Lines changed: 5 additions & 5 deletions
@@ -83,7 +83,7 @@ IPUs provide further optimizations to speed up training. By using the ``IPUPlugi
     from pytorch_lightning.plugins import IPUPlugin
 
     model = MyLightningModule()
-    trainer = pl.Trainer(ipus=8, plugins=IPUPlugin(device_iterations=32))
+    trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(device_iterations=32))
     trainer.fit(model)
 
 Note that by default we return the last device iteration loss. You can override this by passing in your own ``poptorch.Options`` and setting the AnchorMode as described in the `PopTorch documentation <https://docs.graphcore.ai/projects/poptorch-user-guide/en/latest/reference.html#poptorch.Options.anchorMode>`__.

@@ -102,7 +102,7 @@ Note that by default we return the last device iteration loss. You can override
     training_opts.anchorMode(poptorch.AnchorMode.All)
     training_opts.deviceIterations(32)
 
-    trainer = Trainer(ipus=8, plugins=IPUPlugin(inference_opts=inference_opts, training_opts=training_opts))
+    trainer = Trainer(ipus=8, strategy=IPUPlugin(inference_opts=inference_opts, training_opts=training_opts))
     trainer.fit(model)
 
 You can also override all options by passing the ``poptorch.Options`` to the plugin. See `PopTorch options documentation <https://docs.graphcore.ai/projects/poptorch-user-guide/en/latest/batching.html>`__ for more information.

@@ -124,7 +124,7 @@ Lightning supports dumping all reports to a directory to open using the tool.
     from pytorch_lightning.plugins import IPUPlugin
 
     model = MyLightningModule()
-    trainer = pl.Trainer(ipus=8, plugins=IPUPlugin(autoreport_dir="report_dir/"))
+    trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(autoreport_dir="report_dir/"))
     trainer.fit(model)
 
 This will dump all reports to ``report_dir/`` which can then be opened using the Graph Analyser Tool, see `Opening Reports <https://docs.graphcore.ai/projects/graphcore-popvision-user-guide/en/latest/graph/graph.html#opening-reports>`__.

@@ -174,7 +174,7 @@ Below is an example using the block annotation in a LightningModule.
 
 
     model = MyLightningModule()
-    trainer = pl.Trainer(ipus=8, plugins=IPUPlugin(device_iterations=20))
+    trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(device_iterations=20))
     trainer.fit(model)
 

@@ -217,7 +217,7 @@ You can also use the block context manager within the forward function, or any o
 
 
    model = MyLightningModule()
-    trainer = pl.Trainer(ipus=8, plugins=IPUPlugin(device_iterations=20))
+    trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(device_iterations=20))
     trainer.fit(model)
 
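The IPU examples receive the same rename, with `IPUPlugin` now passed through `strategy=`. A minimal sketch combining the PopTorch options shown in the diff; it assumes an IPU-enabled PopTorch environment, and `MyLightningModule` is the placeholder module from these docs.

.. code-block:: python

    import poptorch
    import pytorch_lightning as pl
    from pytorch_lightning.plugins import IPUPlugin

    model = MyLightningModule()  # placeholder module, as in the docs above

    # Custom PopTorch options are still configured on the plugin, which is now
    # handed to the Trainer via ``strategy=`` instead of ``plugins=``.
    training_opts = poptorch.Options()
    training_opts.anchorMode(poptorch.AnchorMode.All)
    training_opts.deviceIterations(32)

    trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(training_opts=training_opts))
    trainer.fit(model)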

docs/source/advanced/mixed_precision.rst

Lines changed: 2 additions & 2 deletions
@@ -50,14 +50,14 @@ BFloat16 Mixed precision is similar to FP16 mixed precision, however we maintain
 Since BFloat16 is more stable than FP16 during training, we do not need to worry about any gradient scaling or nan gradient values that comes with using FP16 mixed precision.
 
 .. testcode::
-    :skipif: not _TORCH_BFLOAT_AVAILABLE
+    :skipif: not _TORCH_GREATER_EQUAL_DEV_1_10 or not torch.cuda.is_available()
 
     Trainer(gpus=1, precision="bf16")
 
 It is also possible to use BFloat16 mixed precision on the CPU, relying on MKLDNN under the hood.
 
 .. testcode::
-    :skipif: not _TORCH_CPU_AMP_AVAILABLE
+    :skipif: not _TORCH_GREATER_EQUAL_DEV_1_10
 
     Trainer(precision="bf16")
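The updated `:skipif:` guards tie these examples to a recent enough PyTorch build (a 1.10 development version) instead of the removed availability flags. For reference, the two configurations exercised by the testcode blocks are simply the following, assuming such a PyTorch build is installed:

.. code-block:: python

    from pytorch_lightning import Trainer

    # BFloat16 mixed precision on GPU (guarded above by CUDA availability)
    trainer = Trainer(gpus=1, precision="bf16")

    # BFloat16 mixed precision on CPU, backed by MKLDNN
    trainer = Trainer(precision="bf16")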
