
Commit 615c04d

[Pipelines] Add community pipeline for Zero123 (#4295)
* add zero123 pipeline to community
* add community doc
* reformat
* update zero123 pipeline, including cc_projection within diffusers; add convert ckpt scripts; support diffusers weights
1 parent ae82a3e commit 615c04d

File tree

3 files changed: +1775 −0 lines changed


examples/community/README.md

+83
@@ -39,6 +39,8 @@ If a community doesn't work as expected, please open an issue and ping the autho
| CLIP Guided Images Mixing Stable Diffusion Pipeline | Combine images using standard diffusion models. | [CLIP Guided Images Mixing Using Stable Diffusion](#clip-guided-images-mixing-with-stable-diffusion) | - | [Karachev Denis](https://github.com/TheDenk) |
| TensorRT Stable Diffusion Inpainting Pipeline | Accelerates the Stable Diffusion Inpainting Pipeline using TensorRT | [TensorRT Stable Diffusion Inpainting Pipeline](#tensorrt-inpainting-stable-diffusion-pipeline) | - | [Asfiya Baig](https://github.com/asfiyab-nvidia) |
| IADB Pipeline | Implementation of [Iterative α-(de)Blending: a Minimalist Deterministic Diffusion Model](https://arxiv.org/abs/2305.03486) | [IADB Pipeline](#iadb-pipeline) | - | [Thomas Chambon](https://github.com/tchambon) |
| Zero1to3 Pipeline | Implementation of [Zero-1-to-3: Zero-shot One Image to 3D Object](https://arxiv.org/abs/2303.11328) | [Zero1to3 Pipeline](#zero1to3-pipeline) | - | [Xin Kong](https://github.com/kxhit) |

To load a custom pipeline, pass the `custom_pipeline` argument to `DiffusionPipeline`, set to the name of one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines; we will merge them quickly.
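As a minimal sketch of that mechanism (the base checkpoint and the community pipeline chosen here are only illustrative and are not part of this commit), loading a community pipeline by file name can look like this:

```python
import torch
from diffusers import DiffusionPipeline

# "lpw_stable_diffusion" refers to examples/community/lpw_stable_diffusion.py;
# any other file name from that folder can be passed the same way.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",          # illustrative base checkpoint
    custom_pipeline="lpw_stable_diffusion",    # illustrative community pipeline
    torch_dtype=torch.float16,
).to("cuda")
```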
```py
@@ -1767,3 +1769,84 @@ while True:
    loss.backward()
    optimizer.step()
```

### Zero1to3 pipeline

This pipeline is an implementation of the [Zero-1-to-3: Zero-shot One Image to 3D Object](https://arxiv.org/abs/2303.11328) paper. The original PyTorch Lightning implementation lives in the [zero123 repo](https://github.com/cvlab-columbia/zero123), and a diffusers-based port in the [zero123-hf repo](https://github.com/kxhit/zero123-hf).

The following code shows how to use the Zero1to3 pipeline to generate novel-view images of an object from a single input image and a set of target camera poses, using a pretrained Stable Diffusion-based checkpoint.

```python
import os

import torch
from pipeline_zero1to3 import Zero1to3StableDiffusionPipeline
from diffusers.utils import load_image

model_id = "kxic/zero123-165000"  # zero123-105000, zero123-165000, zero123-xl

pipe = Zero1to3StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)

# optional memory optimizations
pipe.enable_xformers_memory_efficient_attention()
pipe.enable_vae_tiling()
pipe.enable_attention_slicing()
pipe = pipe.to("cuda")

num_images_per_prompt = 4

# Each query pose is [polar angle (vertical rotation in degrees),
# azimuth angle (horizontal rotation in degrees),
# zoom (relative distance from center)].
query_pose1 = [-75.0, 100.0, 0.0]
query_pose2 = [-20.0, 125.0, 0.0]
query_pose3 = [-55.0, 90.0, 0.0]

# load images
# H, W = (256, 256)  # zero123 was trained at 256x256; (512, 512) is also possible

# batch input
input_image1 = load_image("./demo/4_blackarm.png")  # or load_image("https://cvlab-zero123-live.hf.space/file=/home/user/app/configs/4_blackarm.png")
input_image2 = load_image("./demo/8_motor.png")  # or load_image("https://cvlab-zero123-live.hf.space/file=/home/user/app/configs/8_motor.png")
input_image3 = load_image("./demo/7_london.png")  # or load_image("https://cvlab-zero123-live.hf.space/file=/home/user/app/configs/7_london.png")
input_images = [input_image1, input_image2, input_image3]
query_poses = [query_pose1, query_pose2, query_pose3]

# # for single input
# H, W = (256, 256)
# input_images = [input_image2.resize((H, W), PIL.Image.NEAREST)]
# query_poses = [query_pose2]

# preprocessing (background removal with Carvekit) is recommended
from gradio_new import preprocess_image, create_carvekit_interface
import numpy as np
import PIL.Image as Image

pre_images = []
models = dict()
print("Instantiating Carvekit HiInterface...")
models["carvekit"] = create_carvekit_interface()
if not isinstance(input_images, list):
    input_images = [input_images]
for raw_im in input_images:
    input_im = preprocess_image(models, raw_im, True)
    H, W = input_im.shape[:2]
    pre_images.append(Image.fromarray((input_im * 255.0).astype(np.uint8)))
input_images = pre_images

# run inference; the original zero123 uses num_inference_steps=76
images = pipe(
    input_imgs=input_images,
    prompt_imgs=input_images,
    poses=query_poses,
    height=H,
    width=W,
    guidance_scale=3.0,
    num_images_per_prompt=num_images_per_prompt,
    num_inference_steps=50,
).images

# save images
log_dir = "logs"
os.makedirs(log_dir, exist_ok=True)
bs = len(input_images)
i = 0
for obj in range(bs):
    for idx in range(num_images_per_prompt):
        images[i].save(os.path.join(log_dir, f"obj{obj}_{idx}.jpg"))
        i += 1
```
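
If you want to skip the Carvekit preprocessing step (results are generally better with it), a minimal single-image variant of the example above, mirroring the commented-out single-input block, could look like the following sketch; the demo image path and pose are just placeholders:

```python
import os

import torch
from PIL import Image
from pipeline_zero1to3 import Zero1to3StableDiffusionPipeline
from diffusers.utils import load_image

pipe = Zero1to3StableDiffusionPipeline.from_pretrained(
    "kxic/zero123-165000", torch_dtype=torch.float16
).to("cuda")

# zero123 was trained on 256x256 inputs
H, W = 256, 256
input_image = load_image("./demo/8_motor.png").resize((W, H), Image.NEAREST)  # placeholder path

# [polar angle, azimuth angle, zoom]
query_pose = [-20.0, 125.0, 0.0]

images = pipe(
    input_imgs=[input_image],
    prompt_imgs=[input_image],
    poses=[query_pose],
    height=H,
    width=W,
    guidance_scale=3.0,
    num_images_per_prompt=4,
    num_inference_steps=50,
).images

os.makedirs("logs", exist_ok=True)
for idx, img in enumerate(images):
    img.save(os.path.join("logs", f"single_{idx}.jpg"))
```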
