Inference of [Stable Diffusion](https://github.com/CompVis/stable-diffusion) in pure C/C++

- Plain C/C++ implementation based on [ggml](https://github.com/ggerganov/ggml), working in the same way as [llama.cpp](https://github.com/ggerganov/llama.cpp)
- Super lightweight and without external dependencies
- SD1.x and SD2.x support
- 16-bit and 32-bit float support
- 4-bit, 5-bit and 8-bit integer quantization support
- Accelerated memory-efficient CPU inference
    - Only requires ~2.3GB when using txt2img with fp16 precision to generate a 512x512 image; enabling Flash Attention lowers this to ~1.8GB
- AVX, AVX2 and AVX512 support for x86 architectures
- Full CUDA backend for GPU acceleration; for now only for float16 and float32 models. There are some issues with quantized models under CUDA; they will be fixed in the future
- Can load ckpt, safetensors and diffusers models/checkpoints, as well as standalone VAE models
    - No need to convert to `.ggml` or `.gguf` anymore!
- Flash Attention for memory usage optimization (CPU only for now)
- Original `txt2img` and `img2img` modes
- Negative prompt

After cloning the repository, initialize and update the submodules:

```sh
git submodule init
git submodule update
```

### Download weights

- Download the original weights (`.ckpt` or `.safetensors`). For example:
    - Stable Diffusion v1.4 from https://huggingface.co/CompVis/stable-diffusion-v-1-4-original
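
For instance, to fetch the v1.4 checkpoint into a `models` directory next to the build directory (a sketch; the `sd-v1-4.ckpt` file name on that Hugging Face repo and the `../models` target path are assumptions based on the commands used later in this README):

```sh
# download the v1.4 checkpoint to the location the examples below expect
mkdir -p ../models
curl -L -o ../models/sd-v1-4.ckpt \
    https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt
```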

You can specify the model weight type using the `--type` parameter. The weights are automatically converted when loading the model:

- `f16` for 16-bit floating-point
- `f32` for 32-bit floating-point
- `q8_0` for 8-bit integer quantization
- `q5_0` or `q5_1` for 5-bit integer quantization
- `q4_0` or `q4_1` for 4-bit integer quantization
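
For example, to quantize the weights to 8-bit integers while loading (a usage sketch combining the `--type` flag above with the txt2img invocation below; the model path assumes the download step above):

```sh
# load the fp32/fp16 checkpoint and convert its weights to q8_0 on the fly
./bin/sd -m ../models/sd-v1-4.ckpt --type q8_0 -p "a lovely cat"
```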

#### txt2img example

```sh
./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"
# ./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat"
```
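
A negative prompt (see the feature list above) can be passed alongside the positive prompt. A hedged sketch, assuming the CLI exposes a `--negative-prompt` option; check `./bin/sd --help` for the exact flag name:

```sh
# steer generation away from the terms in the negative prompt
./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat" --negative-prompt "blurry, low quality, deformed"
```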
Using formats of different precisions will yield results of varying quality.

#### img2img example

Here `./output.png` is the image produced by the txt2img example above:

```sh
./bin/sd --mode img2img -m ../models/sd-v1-4.ckpt -p "cat with blue eyes" -i ./output.png -o ./img2img_output.png --strength 0.4
```

#### with LoRA

- You can specify the directory where the LoRA weights are stored via `--lora-model-dir`. If not specified, the default is the current working directory.
- LoRA is specified via the prompt, just like in [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#lora).

Here's a simple example:

```sh
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:marblesh:1>" --lora-model-dir ../models
```

In this case, `../models/marblesh.safetensors` or `../models/marblesh.ckpt` will be applied to the model.