
Commit 65e453f

committed Mar 6, 2024
initial reference impl
0 parents  commit 65e453f

7 files changed, +1686 -0 lines changed

‎.gitignore

+5
@@ -0,0 +1,5 @@
venv/
models/
__pycache__/
output.png
output.latent

‎README.md

+45
@@ -0,0 +1,45 @@
# Stable Diffusion 3 Micro-Reference Implementation

Inference-only tiny reference implementation of SD3.

Contains code for the text encoders (OpenAI CLIP-L/14, OpenCLIP bigG, Google T5-XXL; all three models are public), the VAE decoder (similar to previous SD models, but with 16 channels and no postquantconv step), and the core MM-DiT (entirely new).

Everything you need to run SD3 inference, excluding the weight files.
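
As a rough illustration of what the 16-channel VAE means for tensor sizes, the sketch below computes the latent shape for a given image size. It assumes the 8x spatial downsampling used by earlier SD VAEs, which this README does not state. `latent_shape` is a hypothetical helper, not a function from this repo.

```python
# Hypothetical helper for illustration only; not part of this repo.
LATENT_CHANNELS = 16  # from the README: the SD3 VAE uses 16 channels
DOWNSAMPLE = 8        # ASSUMPTION: same 8x spatial factor as earlier SD VAEs


def latent_shape(width: int, height: int, batch: int = 1) -> tuple:
    """Return (batch, channels, latent_height, latent_width) for an image size."""
    if width % DOWNSAMPLE or height % DOWNSAMPLE:
        raise ValueError(f"width and height must be multiples of {DOWNSAMPLE}")
    return (batch, LATENT_CHANNELS, height // DOWNSAMPLE, width // DOWNSAMPLE)


if __name__ == "__main__":
    # Under the assumptions above, a 1024x1024 image maps to a 128x128 latent.
    print(latent_shape(1024, 1024))
```

If the downsampling assumption holds, a 1024x1024 request corresponds to a latent of shape `(1, 16, 128, 128)`.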

### Install

```sh
python3 -s -m venv venv
source ./venv/bin/activate
# or on Windows: venv\Scripts\activate
python3 -s -m pip install -r requirements.txt
```

### Test Usage

```sh
# Generate a cat on ref model with default settings
python3 -s sd3_infer.py
# Generate a 1024x1024 cat on SD3-8B
python3 -s sd3_infer.py --width 1024 --height 1024 --shift 3 --model models/sd3_8b_beta.safetensors --prompt "cute wallpaper art of a cat"
```

Images will be output to `output.png` by default.
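
To drive the script from Python rather than the shell, one option is to assemble the argument list programmatically. The sketch below uses only the flags shown in the commands above (`--width`, `--height`, `--shift`, `--model`, `--prompt`). `build_infer_command` is a hypothetical helper, not part of this repo, and it makes no claim about other flags the script may accept.

```python
import shlex
import sys


def build_infer_command(prompt=None, width=None, height=None, shift=None,
                        model=None, script="sd3_infer.py"):
    """Build an argv list for sd3_infer.py (hypothetical helper, not in the repo).

    Only flags that appear in the README are emitted; arguments left as None
    are omitted so the script falls back to its own defaults.
    """
    cmd = [sys.executable, "-s", script]
    options = {
        "--width": width,
        "--height": height,
        "--shift": shift,
        "--model": model,
        "--prompt": prompt,
    }
    for flag, value in options.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_infer_command(
        prompt="cute wallpaper art of a cat",
        width=1024,
        height=1024,
        shift=3,
        model="models/sd3_8b_beta.safetensors",
    )
    # Print the command instead of running it; pass `cmd` to
    # subprocess.run(cmd, check=True) to actually launch inference.
    print(shlex.join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting problems with prompts that contain spaces or quotes.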

### File Guide

- `sd3_infer.py` - entry point; review this for basic usage of the diffusion model and the triple text encoder ("tenc") concatenation
- `sd3_impls.py` - contains the wrapper around the MMDiT and the VAE
- `other_impls.py` - contains the CLIP model, the T5 model, and some utilities
- `mmdit.py` - contains the core of the MMDiT itself
- folder `models` with the following files (download separately):
    - `clip_g.safetensors` (OpenCLIP bigG, same as SDXL, can grab a public copy)
    - `clip_l.safetensors` (OpenAI CLIP-L, same as SDXL, can grab a public copy)
    - `t5xxl.safetensors` (Google T5-v1.1-XXL, can grab a public copy)
    - `sd3_beta.safetensors` (internal, private)
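
Since the weights are downloaded separately, a quick preflight check can catch a missing file before inference starts. The sketch below looks for the four file names listed above under a `models` folder. `missing_model_files` is a hypothetical helper, not part of this repo, and the script itself may locate its weights differently.

```python
from pathlib import Path

# File names taken from the list above; the check itself is illustrative.
EXPECTED_MODEL_FILES = (
    "clip_g.safetensors",
    "clip_l.safetensors",
    "t5xxl.safetensors",
    "sd3_beta.safetensors",
)


def missing_model_files(models_dir="models"):
    """Return the expected weight files that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in EXPECTED_MODEL_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing from models/: " + ", ".join(missing))
    else:
        print("All expected model files are present.")
```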

### Legal

Built by Alex Goodwin for Stability AI and private partners under NDA, heavily based on internal ComfyUI and SGM codebases. Uses some upstream logic from HuggingFace, Google, and PyTorch.

Do not redistribute.
