[Feat] add tiny Autoencoder for (almost) instant decoding #4384


Merged
merged 45 commits into main from feat/tiny-autoenc
Aug 2, 2023

Conversation

sayakpaul
Member

@sayakpaul sayakpaul commented Jul 31, 2023

Fixes #4233.

Considerations

I decided NOT to modify the AutoencoderKL class for this one because:

  • The tiny Autoencoder is trained in a fundamentally different way. Refer to https://github.com/madebyollin/taesd for more details.
  • It bears very little resemblance to how the blocks are structured in AutoencoderKL. The tiny Autoencoder is just a stack of conv + ReLU blocks with residual connections.
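To make the second point concrete, here is a rough sketch (my illustration, not the actual diffusers implementation; the class name, layer count, and channel width are assumptions) of the kind of conv + ReLU residual block the tiny Autoencoder is built from:

```python
import torch
import torch.nn as nn

class TinyResBlock(nn.Module):
    """Illustrative conv + ReLU block with a residual connection,
    in the spirit of TAESD's blocks (not the real diffusers code)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # the residual connection keeps the block cheap and stable to train
        return self.act(self.conv(x) + x)

block = TinyResBlock(64)
out = block(torch.randn(1, 64, 32, 32))  # spatial dims are preserved
```

The absence of attention, norm layers, and a KL posterior is what makes this decoder so much faster than the full AutoencoderKL.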

Results

SD

import torch
from diffusers import DiffusionPipeline, AutoencoderTiny

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base", torch_dtype=torch.float16
)
pipe.vae = AutoencoderTiny.from_pretrained("sayakpaul/taesd-diffusers", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "slice of delicious New York-style berry cheesecake"
image = pipe(
    prompt, num_inference_steps=25, height=512, width=512, guidance_scale=3.0,
    generator=torch.manual_seed(0)
).images[0]
image

Comparison

Tiny AE vs. Original AE (comparison images omitted)

SDXL

from diffusers import DiffusionPipeline, AutoencoderTiny
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipeline.vae = AutoencoderTiny.from_pretrained("sayakpaul/taesdxl-diffusers", torch_dtype=torch.float16)
pipeline = pipeline.to("cuda")

prompt = "slice of delicious New York-style berry cheesecake"
image = pipeline(
    prompt, num_inference_steps=25, guidance_scale=3.0,
    generator=torch.manual_seed(0)
).images[0]
image
Tiny AE vs. Original AE (comparison images omitted)

Todos

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 31, 2023

The documentation is not available anymore as the PR was closed or merged.

def encode(
    self, x: torch.FloatTensor, return_dict: bool = True
) -> Union[TinyAutoencoderOutput, Tuple[torch.FloatTensor]]:
    output = self.encoder(x)
Contributor

If the input images are assumed to be in [-1, 1] convention, I suspect this needs to be self.encoder(x.mul(0.5).add_(0.5))

Member Author

We usually call the encode() method from a training script, where we ALWAYS ensure that the image pixel values are appropriately scaled :)

Contributor

@sayakpaul after the hellscape I've been through with normalisation, I'd love for this to be fixed deeper in
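For readers following this thread: the remapping under discussion, `x.mul(0.5).add_(0.5)`, is just the affine transform that takes pixels from the [-1, 1] convention to [0, 1]. A minimal scalar illustration:

```python
def to_unit_range(x: float) -> float:
    # maps the [-1, 1] pixel convention to [0, 1];
    # the same transform as x.mul(0.5).add_(0.5) on a tensor
    return x * 0.5 + 0.5

# endpoints and midpoint map as expected
assert to_unit_range(-1.0) == 0.0
assert to_unit_range(0.0) == 0.5
assert to_unit_range(1.0) == 1.0
```

Whether this belongs inside encode() or in the caller is exactly the disagreement above: the transform itself is trivial, the question is who owns it.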

Collaborator

@yiyixuxu yiyixuxu left a comment

Cool!
Nice New York cheesecakes 😍😍😍

@sayakpaul
Member Author

@stevhliu could you check the doc part of the PR and let me know your thoughts?

        return init_dict, inputs_dict

    def test_outputs_equivalence(self):
        pass
Contributor

can we also add one integration test where we test that a latent vector is correctly decoded to an image?

Member Author

Added in d7ab16f.
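The test itself isn't quoted here, but the usual shape of such an integration test is: decode a latent produced from a fixed seed, then compare a small output slice against hardcoded expected values. A sketch of that pattern, using a stand-in decoder since the real one needs AutoencoderTiny weights:

```python
import torch

def fake_decode(z: torch.Tensor) -> torch.Tensor:
    # stand-in for vae.decode(z).sample; a real test would load
    # AutoencoderTiny weights and call the actual decoder
    return torch.tanh(z.mean(dim=1, keepdim=True))

# decode a latent produced from a fixed seed ...
torch.manual_seed(0)
image = fake_decode(torch.randn(1, 4, 8, 8))

# ... and check determinism against a re-run; a real test would instead
# hardcode the expected slice from a known-good run
torch.manual_seed(0)
expected = fake_decode(torch.randn(1, 4, 8, 8))
assert torch.allclose(image[0, 0, :2, :2], expected[0, 0, :2, :2], atol=1e-4)
```

Slicing a few elements keeps the hardcoded reference values small while still catching regressions in the decode path.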

Contributor

@patrickvonplaten patrickvonplaten left a comment

Looks good to me! Would be nice to add at least one integration test

@sayakpaul
Member Author

@patrickvonplaten I have added an integration test. Does that work for you?

@madebyollin I know you're AFK, but as an FYI, I have opened the following:

Once these two PRs are merged, we will merge this PR.

Contributor

@patrickvonplaten patrickvonplaten left a comment

Thanks for adding the tests!

Member

@stevhliu stevhliu left a comment

Nice and clear docstrings! 👏

A small detail, but I found it a little curious that the model is called Tiny AutoEncoder, but its API is AutoencoderTiny. Not a big deal, but I think it makes it easier to refer to when they have the same names (ie you don't have to mentally convert Tiny AutoEncoder to AutoencoderTiny).

    if isinstance(module, (EncoderTiny, DecoderTiny)):
        module.gradient_checkpointing = value

def scale_latents(self, x):
Member

I think a sentence explaining what this function and unscale_latents does would be clearer.

Member Author

Since it's not a highlighted method from the docs, I think it's fine as is for now.
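For readers who don't want to open the diff: a plausible sketch of what a scale_latents/unscale_latents pair does. The constants here are assumptions mirroring TAESD's published defaults (latents mostly fall in roughly [-3, 3]), not necessarily the merged code:

```python
LATENT_MAGNITUDE = 3.0  # assumed: raw latents mostly lie in [-3, 3]
LATENT_SHIFT = 0.5

def scale_latents(x: float) -> float:
    # squeeze raw latents into [0, 1] so they can be stored/viewed as images
    return min(max(x / (2 * LATENT_MAGNITUDE) + LATENT_SHIFT, 0.0), 1.0)

def unscale_latents(x: float) -> float:
    # inverse map from [0, 1] back to the raw latent range
    return (x - LATENT_SHIFT) * 2 * LATENT_MAGNITUDE
```

The point of the pair is that scaled latents become valid image data, which is handy for previewing or round-tripping them through image files.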

@sayakpaul
Member Author

A small detail, but I found it a little curious that the model is called Tiny AutoEncoder, but its API is AutoencoderTiny.

AutoencoderTiny was done to be in line with AutoencoderKL.

@sayakpaul sayakpaul merged commit 18fc40c into main Aug 2, 2023
@sayakpaul sayakpaul deleted the feat/tiny-autoenc branch August 2, 2023 18:28
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
[Feat] add tiny Autoencoder for (almost) instant decoding (huggingface#4384)

* add: model implementation of tiny autoencoder.

* add: inits.

* push the latest devs.

* add: conversion script and finish.

* add: scaling factor args.

* debugging

* fix denormalization.

* fix: positional argument.

* handle use_torch_2_0_or_xformers.

* handle post_quant_conv

* handle dtype

* fix: sdxl image processor for tiny ae.

* fix: sdxl image processor for tiny ae.

* unify upcasting logic.

* copied from madness.

* remove trailing whitespace.

* set is_tiny_vae = False

* address PR comments.

* change to AutoencoderTiny

* make act_fn an str throughout

* fix: apply_forward_hook decorator call

* get rid of the special is_tiny_vae flag.

* directly scale the output.

* fix dummies?

* fix: act_fn.

* get rid of the Clamp() layer.

* bring back copied from.

* movement of the blocks to appropriate modules.

* add: docstrings to AutoencoderTiny

* add: documentation.

* changes to the conversion script.

* add doc entry.

* settle tests.

* style

* add one slow test.

* fix

* fix 2

* fix 2

* fix: 4

* fix: 5

* finish integration tests

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* style

---------

Co-authored-by: Steven Liu <[email protected]>
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
Successfully merging this pull request may close these issues.

Add Taesd light weight encoder/decoder
7 participants