docs/source/en/index.mdx (+67 −52)
@@ -16,61 +16,76 @@ specific language governing permissions and limitations under the License.
 <br>
 </p>

-# 🧨 Diffusers
+# Diffusers

-🤗 Diffusers provides pretrained vision and audio diffusion models, and serves as a modular toolbox for inference and training.
+🤗 Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or want to train your own diffusion model, 🤗 Diffusers is a modular toolbox that supports both. Our library is designed with a focus on [usability over performance](conceptual/philosophy#usability-over-performance), [simple over easy](conceptual/philosophy#simple-over-easy), and [customizability over abstractions](conceptual/philosophy#tweakable-contributorfriendly-over-abstraction).

-More precisely, 🤗 Diffusers offers:
+The library has three main components:

-- State-of-the-art diffusion pipelines that can be run in inference with just a couple of lines of code (see [**Using Diffusers**](./using-diffusers/conditional_image_generation)), or have a look at [**Pipelines**](#pipelines) for an overview of all supported pipelines and their corresponding papers.
-- Various noise schedulers that can be used interchangeably for the preferred speed vs. quality trade-off in inference. For more information, see [**Schedulers**](./api/schedulers/overview).
-- Multiple types of models, such as UNet, that can be used as building blocks in an end-to-end diffusion system. See [**Models**](./api/models) for more details.
-- Training examples that show how to train the most popular diffusion model tasks. For more information, see [**Training**](./training/overview).
+- State-of-the-art [diffusion pipelines](api/pipelines/overview) for inference with just a few lines of code.
+- Interchangeable [noise schedulers](api/schedulers/overview) for balancing trade-offs between generation speed and quality.
+- Pretrained [models](api/models) that can be used as building blocks, and combined with schedulers, for creating your own end-to-end diffusion systems.
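The first two components above can be sketched in a few lines. This example is not part of the diff; the checkpoint ID, prompt, and CUDA device are illustrative assumptions, and `DDIMScheduler` simply stands in for any compatible scheduler:

```python
# Minimal sketch: load a pipeline, swap its scheduler, and run inference.
# The checkpoint ID, prompt, and device below are illustrative assumptions.
import torch
from diffusers import DiffusionPipeline, DDIMScheduler

# A pipeline bundles the model weights, scheduler, tokenizer, etc. into one object.
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Schedulers are interchangeable: rebuild a different one from the current
# configuration to trade generation quality against speed.
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

pipeline.to("cuda")
image = pipeline("An astronaut riding a horse on Mars").images[0]
image.save("astronaut.png")
```

Because the scheduler lives on the pipeline as a plain attribute, trying a different scheduler is a one-line change rather than a retraining or reloading step.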
<pclass="text-gray-700">Learn the fundamental skills you need to start generating outputs, build your own diffusion system, and train a diffusion model. We recommend starting here if you're using 🤗 Diffusers for the first time!</p>
<pclass="text-gray-700">Practical guides for helping you load pipelines, models, and schedulers. You'll also learn how to use pipelines for specific tasks, control how outputs are generated, optimize for inference speed, and different training techniques.</p>
<pclass="text-gray-700">Understand why the library was designed the way it was, and learn more about the ethical guidelines and safety implementations for using the library.</p>
-|[stable_diffusion_attend_and_excite](./api/pipelines/stable_diffusion/attend_and_excite)|[**Attend and Excite for Stable Diffusion**](https://attendandexcite.github.io/Attend-and-Excite/)| Text-to-Image Generation |
-|[stochastic_karras_ve](./api/pipelines/stochastic_karras_ve)|[**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364)| Unconditional Image Generation |
-|[unclip](./api/pipelines/unclip)|[Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125)| Text-to-Image Generation |
-|[versatile_diffusion](./api/pipelines/versatile_diffusion)|[Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332)| Text-to-Image Generation |
-|[versatile_diffusion](./api/pipelines/versatile_diffusion)|[Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332)| Image Variations Generation |
-|[versatile_diffusion](./api/pipelines/versatile_diffusion)|[Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332)| Dual Image and Text Guided Generation |
-|[vq_diffusion](./api/pipelines/vq_diffusion)|[Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822)| Text-to-Image Generation |
-
-**Note**: Pipelines are simple examples of how to play around with the diffusion systems as described in the corresponding papers.
+| Pipeline | Paper/Repository | Tasks |
+|---|---|:---:|
+|[alt_diffusion](./api/pipelines/alt_diffusion)|[AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679)| Image-to-Image Text-Guided Generation |
+|[controlnet](./api/pipelines/stable_diffusion/controlnet)|[Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543)| Image-to-Image Text-Guided Generation |
+|[cycle_diffusion](./api/pipelines/cycle_diffusion)|[Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance](https://arxiv.org/abs/2210.05559)| Image-to-Image Text-Guided Generation |
+|[stochastic_karras_ve](./api/pipelines/stochastic_karras_ve)|[Elucidating the Design Space of Diffusion-Based Generative Models](https://arxiv.org/abs/2206.00364)| Unconditional Image Generation |
+|[unclip](./api/pipelines/unclip)|[Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125) (implementation by [kakaobrain](https://github.com/kakaobrain/karlo)) | Text-to-Image Generation |
+|[versatile_diffusion](./api/pipelines/versatile_diffusion)|[Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332)| Text-to-Image Generation |
+|[versatile_diffusion](./api/pipelines/versatile_diffusion)|[Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332)| Image Variations Generation |
+|[versatile_diffusion](./api/pipelines/versatile_diffusion)|[Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332)| Dual Image and Text Guided Generation |
+|[vq_diffusion](./api/pipelines/vq_diffusion)|[Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822)| Text-to-Image Generation |