If there are layers of the UNet with empty weights, creating a `state_dict` via `model_to_save.state_dict()` will drop them, and you will then get an error when trying to load them via `load_attn_procs`, since there are missing keys.
The current approach is to manually add the keys of the empty-weight layers to the `state_dict` with an empty dictionary value. This is fine when using `torch.save`, but fails when trying to save the file using `safetensors`, with the following error:
Traceback (most recent call last):
File "examples/custom_diffusion/train_custom_diffusion.py", line 1330, in <module>
main(args)
File "examples/custom_diffusion/train_custom_diffusion.py", line 1258, in main
unet.save_attn_procs(args.output_dir, safe_serialization=not args.no_safe_serialization)
File "/diffusers/src/diffusers/loaders.py", line 568, in save_attn_procs
save_function(state_dict, os.path.join(save_directory, weight_name))
File "/diffusers/src/diffusers/loaders.py", line 534, in save_function
return safetensors.torch.save_file(weights, filename, metadata={"format": "pt"})
File "/opt/venv/lib/python3.8/site-packages/safetensors/torch.py", line 232, in save_file
serialize_file(_flatten(tensors), filename, metadata=metadata)
File "/opt/venv/lib/python3.8/site-packages/safetensors/torch.py", line 383, in _flatten
raise ValueError(f"Key `{k}` is invalid, expected torch.Tensor but received {type(v)}")
ValueError: Key `down_blocks.1.attentions.0.transformer_blocks.0.attn1.processor` is invalid, expected torch.Tensor but received <class 'dict'>
`safetensors` does not support saving `dict`-type values, which is what the `ValueError` above shows. Trying to replace the dict with an empty tensor in `loaders.py` instead leads to the following error:
Traceback (most recent call last):
File "examples/custom_diffusion/train_custom_diffusion.py", line 1330, in <module>
main(args)
File "examples/custom_diffusion/train_custom_diffusion.py", line 1258, in main
unet.save_attn_procs(args.output_dir, safe_serialization=not args.no_safe_serialization)
File "/diffusers/src/diffusers/loaders.py", line 568, in save_attn_procs
save_function(state_dict, os.path.join(save_directory, weight_name))
File "/diffusers/src/diffusers/loaders.py", line 534, in save_function
return safetensors.torch.save_file(weights, filename, metadata={"format": "pt"})
File "/opt/venv/lib/python3.8/site-packages/safetensors/torch.py", line 232, in save_file
serialize_file(_flatten(tensors), filename, metadata=metadata)
File "/opt/venv/lib/python3.8/site-packages/safetensors/torch.py", line 394, in _flatten
raise RuntimeError(
RuntimeError:
Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'mid_block.attentions.0.transformer_blocks.0.attn1.processor', 'up_blocks.0.attentions.1.transformer_blocks.0.attn1.processor', 'up_blocks.0.attentions.2.transformer_blocks.0.attn1.processor', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.processor', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.processor', 'up_blocks.0.attentions.0.transformer_blocks.0.attn1.processor'}].
A potential way to correctly save your model is to use `save_model`.
More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
IMO, we shouldn't introduce keys with empty values when saving the `state_dict`. It would be better to omit the empty keys while saving and deal with the missing keys when loading the weights (perhaps via `load_state_dict(state_dict, strict=False)`?).
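A sketch of that proposal (the filtering helper is hypothetical, not part of the diffusers API): drop non-tensor placeholder entries before serialization, then tolerate the resulting missing keys at load time with `strict=False`:

```python
import torch


def filter_state_dict(state_dict):
    # Keep only real tensor weights; dropping the empty-dict placeholders
    # makes the result serializable by both torch.save and safetensors.
    return {k: v for k, v in state_dict.items() if isinstance(v, torch.Tensor)}


# Saving side: the placeholder key is simply omitted.
state_dict = {"layer.weight": torch.ones(2, 2), "attn1.processor": {}}
clean = filter_state_dict(state_dict)
assert "attn1.processor" not in clean

# Loading side: strict=False tolerates keys that were omitted when saving.
model = torch.nn.Linear(2, 2)
missing, unexpected = model.load_state_dict({"weight": torch.ones(2, 2)}, strict=False)
print(missing)  # ['bias']
```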
Thanks for the ping. Will follow if it becomes a prio. (I'm not sure how saving empty tensors/dicts in a file is interesting, aside from making torch happy).
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Describe the bug
Ran into this issue when working on: #4235
Diffusers now defaults to saving models via `safetensors`. In certain cases, e.g. https://github.com/huggingface/diffusers/blob/main/examples/custom_diffusion/train_custom_diffusion.py, saving Custom UNet attention processors via `safetensors` is not possible. This is because of the following lines in `loaders.py`:
diffusers/src/diffusers/loaders.py, lines 551 to 554 in 5049599
Reproduction
Run the `train_custom_diffusion.py` script.
Logs
No response
System Info
N/A
Who can help?
No response