
Stable-Diffusion-Inpainting: Training Pipeline V1.5, V2 #6922


Open
wants to merge 20 commits into base: main

Conversation

cryptexis

@cryptexis cryptexis commented Feb 9, 2024

What does this PR do?

This functionality allows training/fine-tuning of the 9-channel inpainting models provided by

This was motivated by noticing that many inpainting models provided to the community, e.g. on https://civitai.com/, have UNets with only 4 input channels. 4-channel models may lack capacity, and ultimately quality, on inpainting tasks. To help the community develop fully fledged inpainting models, I have modified the text_to_image training pipeline to do inpainting.
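For context, the 9 input channels come from concatenating the noisy latents (4), the downsampled binary mask (1), and the VAE-encoded masked image (4), which is how the stable-diffusion-inpainting checkpoint is fed. A minimal sketch of that concatenation (shapes and names here are illustrative, not the script's actual code):

```python
import torch

def prepare_inpainting_unet_input(noisy_latents, mask, masked_image_latents):
    # 4 (noisy latents) + 1 (mask) + 4 (masked-image latents) = 9 channels
    return torch.cat([noisy_latents, mask, masked_image_latents], dim=1)

# illustrative shapes for a 512x512 image in SD's 64x64 latent space
noisy = torch.randn(2, 4, 64, 64)
mask = torch.ones(2, 1, 64, 64)
masked = torch.randn(2, 4, 64, 64)
unet_input = prepare_inpainting_unet_input(noisy, mask, masked)
print(unet_input.shape)  # torch.Size([2, 9, 64, 64])
```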

Additions:

  • Added random masking strategy (squares) during the training, center crop during validation
  • Take first 3 images of the pokemon dataset as validation set
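A minimal sketch of the masking scheme described above, assuming a random square during training and a fixed center square during validation (function names and size ranges are illustrative, not the script's actual ones):

```python
import torch

def random_square_mask(height, width, generator=None):
    """Training mask: a single white square (1 = region to inpaint) at a random position."""
    mask = torch.zeros(1, 1, height, width)
    size = int(torch.randint(height // 4, height // 2, (1,), generator=generator))
    top = int(torch.randint(0, height - size + 1, (1,), generator=generator))
    left = int(torch.randint(0, width - size + 1, (1,), generator=generator))
    mask[:, :, top:top + size, left:left + size] = 1.0
    return mask

def center_square_mask(height, width):
    """Validation mask: a deterministic square over the central quarter of the image."""
    mask = torch.zeros(1, 1, height, width)
    mask[:, :, height // 4:3 * height // 4, width // 4:3 * width // 4] = 1.0
    return mask

train_mask = random_square_mask(64, 64, generator=torch.Generator().manual_seed(0))
val_mask = center_square_mask(64, 64)
```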

Before submitting

Who can review?

@sayakpaul and @patrickvonplaten

Examples: Out-of-Training-Distribution Scenery

Prompt: a drawing of a green pokemon with red eyes

Pre-trained

pretrained_0

Fine-tuned

finetuned_0

Prompt: a green and yellow toy with a red nose

Pre-trained

pretrained_1

Fine-tuned

finetuned_1

Prompt: a red and white ball with an angry look on its face

Pre-trained

pretrained_2

Fine-tuned

finetuned_2

@sayakpaul sayakpaul requested a review from patil-suraj February 9, 2024 13:30
@cryptexis
Author

Hi @patil-suraj @sayakpaul, I was wondering whether this is something you would be interested in looking into? Feedback is appreciated.

@yiyixuxu
Collaborator

Cool!
Gentle ping @patil-suraj

@drhead
Contributor

drhead commented Feb 19, 2024

I've experimented with finetuning proper inpainting models before. I strongly urge you to read the LaMa paper (https://arxiv.org/pdf/2109.07161.pdf) and implement their masking strategy (which is what the stable-diffusion-inpainting checkpoint uses). I used a very simple masking strategy like yours for a long time and never got satisfactory results with my model until switching to the LaMa masking strategy. Training on simple white square masks will severely degrade the performance of the pretrained SD inpainting model.
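For reference, LaMa's strategy draws irregular "brush stroke" masks (random thick polylines of varying angle, length, and width) rather than axis-aligned squares. A simplified NumPy sketch of the idea; all parameters here are illustrative, and the paper's supplement describes the actual generator:

```python
import math
import random

import numpy as np

def lama_style_irregular_mask(h, w, max_strokes=5, max_vertices=8, max_width=40):
    """Draw random thick polylines ("brush strokes"), a simplified version of
    LaMa's irregular mask generator. 1.0 marks pixels to inpaint."""
    mask = np.zeros((h, w), dtype=np.float32)
    for _ in range(random.randint(1, max_strokes)):
        x, y = random.randint(0, w - 1), random.randint(0, h - 1)
        for _ in range(random.randint(1, max_vertices)):
            angle = random.uniform(0, 2 * math.pi)
            length = random.randint(10, max(11, min(h, w) // 4))
            brush = random.randint(5, max_width)
            nx = int(np.clip(x + length * math.cos(angle), 0, w - 1))
            ny = int(np.clip(y + length * math.sin(angle), 0, h - 1))
            # stamp squares along the segment to approximate a thick line
            steps = max(abs(nx - x), abs(ny - y), 1)
            for t in range(steps + 1):
                cx = x + (nx - x) * t // steps
                cy = y + (ny - y) * t // steps
                mask[max(0, cy - brush // 2):cy + brush // 2,
                     max(0, cx - brush // 2):cx + brush // 2] = 1.0
            x, y = nx, ny
    return mask

random.seed(0)
demo_mask = lama_style_irregular_mask(128, 128)
```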


```python
if args.push_to_hub:
    repo_id = create_repo(
        repo_id=args.hub_model_id or Path(args.output_dir).name, exist_ok=True, token=args.hub_token
    )
```
Member

Let's make sure to follow:

```python
if args.report_to == "wandb" and args.hub_token is not None:
```

Otherwise, hub_token will be compromised on the wandb run page.
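For concreteness, the guard used in other diffusers training scripts looks roughly like the sketch below (wrapped in a function here so it can be exercised standalone; the exact message wording is illustrative):

```python
from types import SimpleNamespace

def check_hub_token_safety(args):
    # Refuse this combination: wandb records the full command line, so passing
    # --hub_token while reporting to wandb would publish the token.
    if args.report_to == "wandb" and args.hub_token is not None:
        raise ValueError(
            "You cannot use both --report_to=wandb and --hub_token: the token would be "
            "exposed on the wandb run page. Log in with `huggingface-cli login` instead."
        )

check_hub_token_safety(SimpleNamespace(report_to="tensorboard", hub_token="hf_xxx"))  # fine
```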

Author

done

Member

Seems like this comment wasn't addressed?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@sayakpaul sayakpaul left a comment

Left some initial comments. Looking quite nice.

I do think having an option to enable LaMa-like masking might be a very good reference point, as our training scripts are quite widely referenced.

And I apologize for the delay.

@cryptexis
Author

> I've experimented with finetuning proper inpainting models before. I strongly urge you to read the LAMA paper (https://arxiv.org/pdf/2109.07161.pdf) and implement their masking strategy (which is what is used by the stable-diffusion-inpainting checkpoint). I used a very simple masking strategy like what you had for a long time and never got satisfactory results with my model until switching to the LAMA masking strategy. Training on simple white square masks will severely degrade the performance of the pretrained SD inpainting model.

@sayakpaul

I thought having the simplest implementation would do, and then the user could decide which masking strategy to use. Sure, I will add that if it's a deal breaker.

@cryptexis
Author

@sayakpaul I have adapted the masking strategy from the LaMa paper on my local branch. I have a question: is it in line with the guidelines to keep the masking properties in a separate config file, like here:
https://github.com/advimman/lama/blob/main/configs/training/data/abl-04-256-mh-dist-celeba.yaml#L10 ?

I feel it is a bit extensive and confusing to expose all of those property values as CLI arguments; it might clutter things and blur which arguments are model-specific and which are data-specific.
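For illustration only, a separate masking config in the spirit of the linked LaMa file could look like the fragment below; the keys are made up for this sketch and do not reproduce LaMa's exact schema:

```yaml
mask_generator:
  kind: mixed            # irregular brush strokes + boxes
  irregular_proba: 0.5
  irregular_kwargs:
    max_strokes: 5
    max_width: 40
  box_proba: 0.5
  box_kwargs:
    min_size: 32
    max_size: 128
```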

@sayakpaul
Member

> I feel it is a bit extensive and confusing to make all of those property values as part of CLI arguments, might clutter and confuse - which arguments are model specific and which ones are data specific.

You are absolutely correct. What we can do is include a note about the masking strategy in the README and link to your implementation. Does that sound good?

Member

@sayakpaul sayakpaul left a comment

Looking really nice now. I will let @patil-suraj review this too.

```python
prompt = batch["prompts"][0]

with torch.autocast("cuda"):
    #### UPDATE PIPELINE HERE
```
Member

Does this comment need to be removed?

Author

Which one?

Member

"#### UPDATE PIPELINE HERE"


```python
if args.push_to_hub:
    repo_id = create_repo(
        repo_id=args.hub_model_id or Path(args.output_dir).name, exist_ok=True, token=args.hub_token
    )
```
Member

Seems like this comment wasn't addressed?

```python
init_image = image_transform(batch["pixel_values"][0])
prompt = batch["prompts"][0]

with torch.autocast("cuda"):
```
Member

Let's make use of the log_validation() function here and log the results to wandb as well. You can refer to https://github.com/huggingface/diffusers/blob/main/examples/controlnet/train_controlnet.py for implementing this. But let me know if you need some more clarifications.

Author

done

@sayakpaul
Member

I think we also need to add a test case here.

@cryptexis
Author

Screenshot 2024-03-02 at 13 20 53

@sayakpaul I think it's a GitHub glitch :) to the extent that I cannot reply to you there.

https://github.com/cryptexis/diffusers/blob/sd_15_inpainting/examples/inpainting/train_inpainting.py#L771 - in my repo I do not have anything similar to it under those lines. And the piece of code you're referring to is here.

@cryptexis
Author

cryptexis commented Mar 2, 2024

> I think we also need to add a test case here.

I see that https://huggingface.co/hf-internal-testing is used a lot in the testing. Are mere mortals able to add unit tests?

@cryptexis
Author

Examples Training with Random Masking

Inference with Square Mask (as before)

Prompt: a drawing of a green pokemon with red eyes

pre-trained stable-diffusion-inpainting

pretrained_inpainting_0

fine-tuned stable-diffusion-inpainting

finetuned_inpainting_0

pre-trained stable-diffusion-v1-5

pretrained_text2img_0

fine-tuned stable-diffusion-v1-5 (no inpainting)

finetuned_text2img_0

fine-tuned stable-diffusion-v1-5 (inpainting)

finetuned_text2img_to_inpainting_0

Inference with Random Mask

pre-trained stable-diffusion-inpainting

pretrained_inpainting_2

fine-tuned stable-diffusion-inpainting

finetuned_inpainting_2

pre-trained stable-diffusion-v1-5

pretrained_text2img_2

fine-tuned stable-diffusion-v1-5 (no inpainting)

finetuned_text2img_2

fine-tuned stable-diffusion-v1-5 (inpainting)

finetuned_text2img_to_inpainting_2

Member

@sayakpaul sayakpaul left a comment

Looking good. I think the only thing that is pending now is the testing suite.

@cryptexis
Author

> Looking good. I think the only thing that is pending now is the testing suite.

@sayakpaul I worked yesterday on the tests and hit a wall. Then I tried to run the tests for text_to_image and hit the same wall:

attaching the screenshot:
Screenshot 2024-03-03 at 06 56 07

I was wondering if it is a systematic issue across all tests...

@sayakpaul
Member

> @sayakpaul worked yesterday on the tests. Hit a wall. Then tried to run tests for the text_to_image and hit the same wall:

Had it been the case, it would have been caught in the CI. The CI doesn't indicate so. Feel free to push the tests and then we can work towards fixing them. WDYT?

BTW, for fixing the code quality issues, we need to run make style && make quality from the root of diffusers.

@cryptexis
Author

> > @sayakpaul worked yesterday on the tests. Hit a wall. Then tried to run tests for the text_to_image and hit the same wall:
>
> Had it been the case, it would have been caught in the CI. The CI doesn't indicate so. Feel free to push the tests and then we can work towards fixing them. WDYT?
>
> BTW, for fixing the code quality issues, we need to run make style && make quality from the root of diffusers.

Done @sayakpaul, I think everything is addressed and the tests are pushed. Thanks a lot for the patience, support, and all the help!

@crapthings

How do we prepare the dataset?

  • image
  • mask
  • prompt
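One possible layout, sketched here purely as an assumption (the column names and metadata format are illustrative, not the script's actual contract): pair each image with a mask file (white = region to inpaint) and a caption via a JSONL metadata file.

```python
import json

# one record per training example: image path, mask path, caption
records = [
    {"image": "images/0001.png", "mask": "masks/0001.png",
     "text": "a drawing of a green pokemon with red eyes"},
    {"image": "images/0002.png", "mask": "masks/0002.png",
     "text": "a green and yellow toy with a red nose"},
]
metadata_jsonl = "\n".join(json.dumps(r) for r in records)
```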

@sayakpaul
Member

@cryptexis let's fix the example tests that are failing now.

@Srinivasa-N707

Can anyone share a script for SDXL inpainting fine-tuning?

Contributor

@patil-suraj patil-suraj left a comment

Thanks a lot for working on this, the script looks great! Just left some nits.

For the runwayml inpainting model, during training they mask the whole image 25% of the time. Have you experimented with that?
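That trick is easy to fold into the mask sampler; a hedged sketch in plain Python (names and size ranges are illustrative):

```python
import random

def sample_mask(h, w, all_mask_prob=0.25, rng=None):
    """With probability all_mask_prob return a full mask (entire image masked),
    otherwise a random square mask. 1.0 marks pixels to inpaint."""
    rng = rng or random.Random()
    if rng.random() < all_mask_prob:
        return [[1.0] * w for _ in range(h)]
    size = rng.randint(h // 4, h // 2)
    top, left = rng.randint(0, h - size), rng.randint(0, w - size)
    mask = [[0.0] * w for _ in range(h)]
    for r in range(top, top + size):
        for c in range(left, left + size):
            mask[r][c] = 1.0
    return mask
```

Because the partial masks never exceed half the image side, a fully-white mask can only come from the 25% branch, which makes the behavior easy to verify empirically.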

```text
ftfy
tensorboard
Jinja2
peft==0.7.0
```
Contributor

Do we need peft for this example?

Contributor

github-actions bot commented Apr 4, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Apr 4, 2024
@cs-mshah

When is this getting merged?

@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label Apr 11, 2024
@yiyixuxu
Collaborator

yiyixuxu commented Apr 11, 2024

@cryptexis
can you

  1. address the final comment here: #6922 (comment). If peft is not used we can remove it; otherwise we are all good
  2. make sure the tests pass

will merge once the tests pass!

@zijinY

zijinY commented May 2, 2024

@Sanster Thanks for your plan. I also want to fine-tune a stable diffusion inpainting model for object removal. Have you tried this? How is the performance?


@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024
@fire2323

Hi @patil-suraj, thanks for the convenient script! Is there any code example and dataset example for running the script https://github.com/huggingface/diffusers/blob/inpainting-script/examples/inpainting/train_inpainting_sdxl.py ?

@github-actions github-actions bot removed the stale Issues that haven't received updates label Feb 11, 2025

@github-actions github-actions bot added the stale Issues that haven't received updates label Mar 8, 2025