🐛 Bug
I trained a large model using native AMP, but the loss converged very slowly. After a careful check of the backward and optimization code, I found that `clip_gradients` is executed right after `backward`, while `scaler.unscale_` is only called in `pre_optimization_step`. According to the PyTorch documentation, gradients must be unscaled before they are clipped, so the order of these two calls should be swapped. As it stands, using `gradient_clip_val` together with native AMP can lead to a very flat learning curve. I hope this can be fixed.
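
For reference, the order recommended in the PyTorch AMP docs looks roughly like the sketch below. The names `model`, `optimizer`, `dataloader`, `criterion`, and `max_norm` are generic placeholders, not Lightning internals; the point is only that `scaler.unscale_` runs before `clip_grad_norm_`:

```python
import torch

scaler = torch.cuda.amp.GradScaler()

for inputs, targets in dataloader:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = criterion(model(inputs), targets)

    # Backward pass on the scaled loss.
    scaler.scale(loss).backward()

    # Unscale gradients *before* clipping, so the clip threshold
    # applies to the true gradient magnitudes rather than scaled ones.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)

    # step() skips the update if any gradient is inf/nan.
    scaler.step(optimizer)
    scaler.update()
```

If clipping happens on the still-scaled gradients, the effective clip threshold shrinks by the scale factor, which would explain the very slow convergence observed above.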