# Image Tagging using WD14Tagger

This document is based on information from [this GitHub page](https://github.com/toriato/stable-diffusion-webui-wd14-tagger#mrsmilingwolfs-model-aka-waifu-diffusion-14-tagger).

Using ONNX for inference is recommended. Install ONNX and ONNX Runtime with the following command:

```powershell
pip install onnx==1.15.0 onnxruntime-gpu==1.17.1
```

The model weights will be automatically downloaded from Hugging Face.

# Usage

Run the script to perform tagging.

```powershell
python finetune/tag_images_by_wd14_tagger.py --onnx --repo_id <model repo id> --batch_size <batch size> <training data folder>
```

For example, when running from inside the `finetune` folder with the repository `SmilingWolf/wd-swinv2-tagger-v3`, a batch size of 4, and the training data in the `train_data` folder one level up, the command would be:

```powershell
python tag_images_by_wd14_tagger.py --onnx --repo_id SmilingWolf/wd-swinv2-tagger-v3 --batch_size 4 ..\train_data
```

On the first run, the model files are downloaded automatically to the `wd14_tagger_model` folder (the folder can be changed with the `--model_dir` option).

Tag files are created in the same directory as the training data images, with the same file name and, by default, a `.txt` extension.
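To check the result after tagging, the tag file that sits next to an image can be read back with a few lines of Python. This is a minimal sketch, not part of the tagger script: `read_tags` is a hypothetical helper, and it assumes the default `.txt` extension and the default `, ` separator.

```python
from pathlib import Path


def read_tags(image_path, caption_extension=".txt", caption_separator=", "):
    """Return the tags stored next to an image, or [] if no tag file exists.

    Hypothetical helper: assumes the tag file has the same name as the image
    with `caption_extension`, and that tags are joined with `caption_separator`.
    """
    tag_path = Path(image_path).with_suffix(caption_extension)
    if not tag_path.exists():
        return []
    text = tag_path.read_text(encoding="utf-8").strip()
    # Split on the separator without its surrounding spaces, then trim each tag.
    sep = caption_separator.strip() or caption_separator
    return [tag.strip() for tag in text.split(sep) if tag.strip()]
```

For an image `train_data/img.png`, `read_tags("train_data/img.png")` would look for `train_data/img.txt`.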

## Example

To output in the Animagine XL 3.1 format, it would be as follows (enter on a single line in practice):

```
python tag_images_by_wd14_tagger.py --onnx --repo_id SmilingWolf/wd-swinv2-tagger-v3
 --batch_size 4 --remove_underscore --undesired_tags "PUT,YOUR,UNDESIRED,TAGS" --recursive
 --use_rating_tags_as_last_tag --character_tags_first --character_tag_expand
 --always_first_tags "1girl,1boy" ..\train_data
```
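Because the command has to be entered on a single line, one way to avoid line-continuation and quoting problems is to launch it from Python with the arguments passed as a list. The following is a sketch under assumptions: `build_tagger_args` and `run_tagger` are hypothetical helper names, and it assumes the script is in the current directory as in the example above.

```python
import subprocess
import sys


def build_tagger_args(train_dir):
    """Return the argument list for the Animagine XL 3.1 style example."""
    return [
        sys.executable, "tag_images_by_wd14_tagger.py",
        "--onnx",
        "--repo_id", "SmilingWolf/wd-swinv2-tagger-v3",
        "--batch_size", "4",
        "--remove_underscore",
        "--undesired_tags", "PUT,YOUR,UNDESIRED,TAGS",
        "--recursive",
        "--use_rating_tags_as_last_tag",
        "--character_tags_first",
        "--character_tag_expand",
        "--always_first_tags", "1girl,1boy",
        train_dir,
    ]


def run_tagger(train_dir):
    """Run the tagger; raises CalledProcessError if the script fails."""
    subprocess.run(build_tagger_args(train_dir), check=True)
```

Calling `run_tagger(r"..\train_data")` would then start the same run as the command above.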

## Available Repository IDs

[SmilingWolf's V2 and V3 models](https://huggingface.co/SmilingWolf) are available for use. Specify them in a format like `SmilingWolf/wd-vit-tagger-v3`. The default when omitted is `SmilingWolf/wd-v1-4-convnext-tagger-v2`.

# Options

## General Options

- `--onnx`: Use ONNX for inference. If not specified, TensorFlow will be used. If using TensorFlow, please install TensorFlow separately.
- `--batch_size`: Number of images to process at once. Default is 1. Adjust according to VRAM capacity.
- `--caption_extension`: File extension for caption files. Default is `.txt`.
- `--max_data_loader_n_workers`: Maximum number of workers for DataLoader. Specifying a value of 1 or more will use DataLoader to speed up image loading. If unspecified, DataLoader will not be used.
- `--thresh`: Confidence threshold for outputting tags. Default is 0.35. Lowering the value will assign more tags but accuracy will decrease.
- `--general_threshold`: Confidence threshold for general tags. If omitted, same as `--thresh`.
- `--character_threshold`: Confidence threshold for character tags. If omitted, same as `--thresh`.
- `--recursive`: If specified, subfolders within the specified folder will also be processed recursively.
- `--append_tags`: Append tags to existing tag files.
- `--frequency_tags`: Output tag frequencies.
- `--debug`: Debug mode. Outputs debug information if specified.
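The relationship between the three threshold options can be illustrated with a short sketch. This is not the script's actual code: `select_tags` is a hypothetical function, the scores are made up, and treating a score equal to the threshold as a match is an assumption.

```python
def select_tags(general_scores, character_scores, thresh=0.35,
                general_threshold=None, character_threshold=None):
    """Keep tags whose confidence reaches the applicable threshold.

    Illustrative only: the per-category thresholds fall back to `thresh`
    when omitted, as described for --general_threshold and
    --character_threshold.
    """
    if general_threshold is None:
        general_threshold = thresh
    if character_threshold is None:
        character_threshold = thresh
    general = [t for t, p in general_scores.items() if p >= general_threshold]
    character = [t for t, p in character_scores.items() if p >= character_threshold]
    return general, character


# Made-up confidence scores for one image.
general_scores = {"1girl": 0.98, "smile": 0.40, "hat": 0.20}
character_scores = {"hatsune_miku": 0.60}

# With the defaults, "hat" (0.20) falls below 0.35 and is dropped.
print(select_tags(general_scores, character_scores))
# A stricter character threshold drops the character tag as well.
print(select_tags(general_scores, character_scores, character_threshold=0.8))
```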

## Model Download

- `--model_dir`: Folder to save model files. Default is `wd14_tagger_model`.
- `--force_download`: Re-download model files if specified.

## Tag Editing

- `--remove_underscore`: Remove underscores from output tags.
- `--undesired_tags`: Tags to exclude from the output. Multiple tags can be specified, separated by commas. For example, `black eyes,black hair`.
- `--use_rating_tags`: Output rating tags at the beginning of the tag list.
- `--use_rating_tags_as_last_tag`: Output rating tags at the end of the tag list.
- `--character_tags_first`: Output character tags first.
- `--character_tag_expand`: Expand the series name in character tags. For example, split the tag `chara_name_(series)` into `chara_name, series`.
- `--always_first_tags`: Tags to always move to the front of the output when they are detected in an image. Multiple tags can be specified, separated by commas. For example, `1girl,1boy`.
- `--caption_separator`: Separate tags with this string in the output file. Default is `, `.
- `--tag_replacement`: Perform tag replacement. Specify in the format `tag1,tag2;tag3,tag4`, which replaces `tag1` with `tag2` and `tag3` with `tag4`.

When `--remove_underscore` is specified, write the tags in `--undesired_tags`, `--always_first_tags`, and `--tag_replacement` without underscores.

When `--caption_separator` is specified, use that separator between the tags in `--undesired_tags` and `--always_first_tags`. The pairs in `--tag_replacement` are always separated with `,`.
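The following sketch shows how several of the tag editing options could combine on one image. It is an illustration under assumptions, not the script's implementation: `edit_tags` is a hypothetical function, underscores are assumed to be replaced with spaces, and the order in which the steps are applied is a guess that matches the note above about writing `--undesired_tags` without underscores.

```python
def edit_tags(general_tags, character_tags, *, remove_underscore=False,
              undesired_tags=(), character_tags_first=False,
              character_tag_expand=False, always_first_tags=()):
    """Apply a subset of the tag editing options to one image's tags."""
    # --character_tag_expand: chara_name_(series) -> chara_name, series
    expanded = []
    for tag in character_tags:
        if character_tag_expand and tag.endswith(")") and "(" in tag:
            name, series = tag[:-1].rsplit("(", 1)
            expanded.extend([name.rstrip("_"), series])
        else:
            expanded.append(tag)

    # --character_tags_first: put character tags before general tags.
    if character_tags_first:
        ordered = expanded + list(general_tags)
    else:
        ordered = list(general_tags) + expanded

    # --remove_underscore (assumed here to replace "_" with a space).
    if remove_underscore:
        ordered = [t.replace("_", " ") for t in ordered]

    # --undesired_tags: compared after underscore removal.
    undesired = set(undesired_tags)
    ordered = [t for t in ordered if t not in undesired]

    # --always_first_tags: move detected tags to the front, in the given order.
    first = [t for t in always_first_tags if t in ordered]
    rest = [t for t in ordered if t not in first]
    return first + rest


tags = edit_tags(
    ["1girl", "long_hair", "black_eyes"],
    ["hatsune_miku_(vocaloid)"],
    remove_underscore=True,
    undesired_tags=["black eyes"],
    character_tags_first=True,
    character_tag_expand=True,
    always_first_tags=["1girl", "1boy"],
)
print(", ".join(tags))  # 1girl, hatsune miku, vocaloid, long hair
```

Rating tags, `--tag_replacement`, and custom separators are left out to keep the sketch short.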