Docs for LoRA are confusing #3219
Valid concerns, and thanks for being comprehensive about them. It helps us improve the docs tremendously. Cc: @stevhliu @yiyixuxu.
Usually, LoRA fine-tuning of the text encoder along with the UNet leads to better results than LoRA fine-tuning the UNet alone. A reference PR that might be relevant here: #3180.
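To make the "text encoder + UNet" point concrete, here is a minimal pure-Python sketch of the LoRA idea itself (illustrative only, not the diffusers implementation): a frozen weight `W` gets a trainable low-rank update `B @ A`, so the effective weight is `W + (alpha / r) * (B @ A)`. The same adapter construction can be attached to both the UNet and the text encoder, which is why tuning both often works better than tuning the UNet alone. All names below are illustrative.

```python
def matmul(X, Y):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]

def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the LoRA-adapted weight."""
    delta = matmul(B, A)          # low-rank update, rank r
    scale = alpha / r             # standard LoRA scaling factor
    return [
        [W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
        for i in range(len(W))
    ]

# LoRA initializes B to zeros, so training starts exactly at the frozen weights:
W = [[1.0, 2.0], [3.0, 4.0]]      # frozen 2x2 weight
A = [[0.5, -0.5]]                 # shape (r=1, 2), randomly initialized in practice
B = [[0.0], [0.0]]                # shape (2, r=1), zero-initialized
assert lora_effective_weight(W, A, B, alpha=4, r=1) == W
```

Because only `A` and `B` are trained, the number of trainable parameters per layer drops from `d_out * d_in` to `r * (d_out + d_in)`, which is what makes adapting both the UNet and the text encoder affordable.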
The usual datasets for text-to-image tasks are ones with image-caption pairs. Existing examples include MS COCO, LAION-5B, etc. You can explore existing datasets here: https://huggingface.co/datasets?task_categories=task_categories:text-to-image&sort=downloads. Additionally, https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions discusses how that dataset was created in its README. I guess that's helpful?
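As a hypothetical illustration of what "image-caption pairs" means structurally, here is the row shape such datasets share. The column names `"image"` and `"text"` mirror the `lambdalabs/pokemon-blip-captions` dataset linked above; your own dataset can use other names as long as you point the training script's image/caption column arguments at them. The file names and captions below are made up.

```python
# Each row pairs one image with one caption (illustrative values):
image_caption_dataset = [
    {"image": "0001.png", "text": "a drawing of a green creature with red eyes"},
    {"image": "0002.png", "text": "a cartoon character with a smile on its face"},
]

def is_valid_row(row):
    """A usable row needs an image reference and a non-empty caption."""
    return "image" in row and bool(row.get("text"))

assert all(is_valid_row(row) for row in image_caption_dataset)
```

Any dataset you can load into this shape (a column of images and a column of caption strings) fits the text-to-image training setup.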
I think it's clear in the code, though. For each index of the dataset we repeat the instance prompt:
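The relevant logic looks roughly like this (a hedged sketch of the DreamBooth dataset behavior; the class and field names here are illustrative, not the actual script):

```python
class DreamBoothDataset:
    """Sketch: every index yields the SAME instance prompt, paired with
    one of the instance images. This is how a single prompt gets
    repeated across the whole dataset."""

    def __init__(self, instance_images, instance_prompt):
        self.instance_images = instance_images
        self.instance_prompt = instance_prompt

    def __len__(self):
        return len(self.instance_images)

    def __getitem__(self, index):
        return {
            "instance_image": self.instance_images[index % len(self.instance_images)],
            # the instance prompt is identical for every example:
            "instance_prompt": self.instance_prompt,
        }
```

So there is no per-image caption in the DreamBooth setup; the single `instance_prompt` passed on the command line is attached to every example.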
As we state here:
We strive to make the training examples as readable as possible. So, it's expected that users read the code a bit. @stevhliu @yiyixuxu what can we do here to improve on this aspect? Perhaps we can briefly mention it at the beginning of the doc so that users are more aware?
This is something we can definitely improve on. Cc: @stevhliu @yiyixuxu. I hope these pointers are helpful.
Thanks for the feedback! 🤗
This is kind of buried at the end of the DreamBooth Training section, so we can move it up to the warning at the top of the page to provide more context.
We can spin this section about training with your own dataset out and then add links to it from each training doc. I think it'll be more visible this way.
This depends on the task you're working on (unconditional, text-to-image, DreamBooth, etc.). LoRA is a way to make training these tasks faster and more efficient, so the dataset format/task is slightly out of scope since it assumes you're familiar with the task. We can improve the DreamBooth doc here to briefly explain what the
Great idea! I think we can pull some info from these blog posts.
Maybe we can highlight and draw attention to certain important parts of the script, similar to this?
Thanks for your brilliant suggestions, Steven! Would you mind opening a PR to address some of these?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Apologies if API Design is not the right tag. I was reading the docs on this page, https://huggingface.co/docs/diffusers/training/lora, and for a number of reasons it seemed very confusing to me as an end-user:
Cheers, this library is definitely the best API for using deep learning models of any kind.