huggingface · yiyixuxu · Jul 1, 2024 · Jun 29, 2024 · Jun 29, 2024 · Jul 1, 2024
diff --git a/docs/source/en/api/pipelines/hunyuandit.md b/docs/source/en/api/pipelines/hunyuandit.md
@@ -34,6 +34,12 @@ Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers.m
 
 </Tip>
 
+<Tip>
+
+Once you have generated an image using [`HunyuanDiTPipeline`], you can pass it to the SDXL refiner to further improve the final generation quality. Check out this [guide](https://huggingface.co/docs/diffusers/using-diffusers/sdxl#base-to-refiner-model) on how to use the SDXL refiner model.
+
+</Tip>
+
 ## Optimization
 
 You can optimize the pipeline's runtime and memory consumption with torch.compile and feed-forward chunking. To learn about other optimization methods, check out the [Speed up inference](../../optimization/fp16) and [Reduce memory usage](../../optimization/memory) guides.

diff --git a/docs/source/en/api/pipelines/pixart_sigma.md b/docs/source/en/api/pipelines/pixart_sigma.md
@@ -37,6 +37,12 @@ Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers)
 
 </Tip>
 
+<Tip>
+
+Once you have generated an image using [`PixArtSigmaPipeline`], you can pass it to the SDXL refiner to further improve the final generation quality. Check out this [guide](https://huggingface.co/docs/diffusers/using-diffusers/sdxl#base-to-refiner-model) on how to use the SDXL refiner model.
+
+</Tip>
+
 ## Inference with under 8GB GPU VRAM
 
 Run the [`PixArtSigmaPipeline`] with under 8GB GPU VRAM by loading the text encoder in 8-bit precision. Let's walk through a full-fledged example.

diff --git a/docs/source/en/using-diffusers/sdxl.md b/docs/source/en/using-diffusers/sdxl.md
@@ -285,6 +285,12 @@ refiner = DiffusionPipeline.from_pretrained(
 ).to("cuda")
 ```
 
+<Tip warning={true}>
+
+You can also use the SDXL refiner with a different base model. For example, you can use the [Hunyuan-DiT](https://huggingface.co/docs/diffusers/api/pipelines/hunyuandit) or [PixArt-Sigma](https://huggingface.co/docs/diffusers/api/pipelines/pixart_sigma) text-to-image pipelines to generate images with better prompt adherence. Once you have generated an image, you can pass it to the SDXL refiner to enhance the final generation quality.
+
+</Tip>
+
 Generate an image from the base model, and set the model output to **latent** space:
 
 ```py

diff --git a/src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3_img2img.py b/src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3_img2img.py
@@ -62,7 +62,7 @@
         >>> pipe = pipe.to(device)
 
         >>> url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
-        >>> init_image = load_image(url).resize((512, 512))
+        >>> init_image = load_image(url).resize((1024, 1024))
 
         >>> prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
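
The workflow the new tips describe (generate with a different base model, then hand the image to the SDXL refiner) could look roughly like the sketch below. It is not part of this PR: the helper names `generate_then_refine` and `load_pipelines`, the `strength=0.3` default, and the checkpoint ids are assumptions chosen for illustration, and the refiner is used as a plain image-to-image pipeline.

```python
def generate_then_refine(base_pipe, refiner_pipe, prompt, strength=0.3):
    """Generate an image with a base pipeline, then refine it with an img2img pipeline.

    Hypothetical helper: `strength` (how much the refiner may change the image)
    is an illustrative default, not a value taken from the docs.
    """
    # The base pipeline returns an output object whose `.images` is a list.
    image = base_pipe(prompt=prompt).images[0]
    # The refiner takes the generated image as its img2img input.
    return refiner_pipe(prompt=prompt, image=image, strength=strength).images[0]


def load_pipelines(device="cuda"):
    """Load Hunyuan-DiT as the base model and the SDXL refiner (downloads weights)."""
    # Imported lazily so the helper above can be used without a GPU setup.
    import torch
    from diffusers import HunyuanDiTPipeline, StableDiffusionXLImg2ImgPipeline

    base = HunyuanDiTPipeline.from_pretrained(
        "Tencent-Hunyuan/HunyuanDiT-Diffusers",  # assumed checkpoint id
        torch_dtype=torch.float16,
    ).to(device)
    refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-refiner-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to(device)
    return base, refiner


# Example usage (requires a CUDA GPU and downloaded weights):
#   base, refiner = load_pipelines()
#   image = generate_then_refine(base, refiner, "an astronaut riding a horse")
#   image.save("refined.png")
```

Keeping the two-step logic in a small function separate from model loading makes it easy to swap in [`PixArtSigmaPipeline`] as the base model instead.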