stable-diffusion-webui-wd14-tagger
stable-diffusion-webui-dataset-tag-editor
Metric | stable-diffusion-webui-wd14-tagger | stable-diffusion-webui-dataset-tag-editor
---|---|---
Mentions | 15 | 7
Stars | 888 | 621
Growth | - | -
Activity | 8.6 | 5.3
Last commit | 10 months ago | 5 months ago
Language | Python | Python
License | - | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
stable-diffusion-webui-wd14-tagger
- CLIP and DeepDanbooru Alternatives For Prompt Generation [Relevant Self-Promotion]
-
Ideas for extensions?
Create an extension like 'send pictures' that uses the WD14 tagger, which is way more detailed and has options for NSFW etc. It's used in Automatic1111 and Kohya ss, so there are extensions you can probably build from. https://github.com/toriato/stable-diffusion-webui-wd14-tagger
-
vladmandic-WD14-Tagger
If anyone is interested, I made some changes to toriato's wd14-tagger so that it now also works on vladmandic's webui; repo here. You can do a new installation, or use your old automatic1111 one by changing 3 files; instructions are on my repo. The LoRA files also work (there were some problems reported on the vlad issue page). I'm not a programmer and it's not perfect, though: for now, if you don't like the default tagger model, you have to change it manually (instructions in the repo), and since it is basically a fork of toriato's version, any errors there will be here too.
-
Community-trained SD 1.6 Model, can we do it?
Automatic captioning tools that can be used as an initial point for captions: this tool or this one.
- Is anyone able to make the tagger extension compatible with Vlad UI?
-
What are your favorite Extensions?
wd14-tagger, to describe anime images and get a prompt idea
-
Experiment AI Anime w/ C-Net 1.1 + GroundingDINO + SAM + MFR (workflow)
Use WD 1.4 tagger (https://github.com/toriato/stable-diffusion-webui-wd14-tagger) to extract prompt words from each frame (threshold 0.65), then use the dataset tag editor (https://github.com/toshiaki1729/stable-diffusion-webui-dataset-tag-editor) for batch editing, mainly:
-
Currently getting better results with Kohya ss Loras (Lycoris Locon) than with DB, am I alone?
I recommend using EveryDream2. You'll need an 11GB VRAM GPU. There's no need to crop or resize images, just caption them, which can be done automatically with CLIP Interrogator or WD14 taggers. Make sure to add the trigger word for your subject. It's not a Dreambooth script; it's actual training, so it shouldn't be as destructive to the model as Dreambooth. Typically, using an LR of 1e-6 with a cosine scheduler over two epochs and a batch size of 4 works fine. This script supports validation, so you can actually watch in real-time whether the training is going well or if you're overfitting. I got very good results using it.
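The schedule mentioned in this post (a peak LR of 1e-6, cosine decay, two epochs, batch size 4) can be illustrated with a small calculation. This is only a sketch of plain cosine annealing; it is not EveryDream2's own scheduler code, and the function names, the zero floor, and the absence of warmup are assumptions made for illustration.

```python
import math


def total_steps(num_images: int, batch_size: int = 4, epochs: int = 2) -> int:
    """Number of optimizer steps, assuming one step per batch."""
    return math.ceil(num_images / batch_size) * epochs


def cosine_lr(step: int, steps: int, base_lr: float = 1e-6, min_lr: float = 0.0) -> float:
    """Learning rate at `step` under plain cosine annealing (no warmup assumed)."""
    progress = min(max(step / max(steps, 1), 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical dataset of 200 images: 50 batches per epoch, 100 steps in total.
steps = total_steps(num_images=200)
for step in (0, steps // 2, steps):
    print(step, cosine_lr(step, steps))
```

The rate starts at the base value, passes through roughly half of it at the midpoint, and reaches the floor at the final step.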
-
For Lora training, isn’t there a good AI that describes the pictures you want to use for training?
In my current process, I use CLIP Interrogator to produce a high level caption and wd14 tagger for more granular booru tags. Typically in that order, because you can append the results from the latter to the former. Both tools perform with greater accuracy than the standard interrogators in img2img and give you more flexibility and features as well. You still have to do some manual adjustments, but I generally prefer this process over starting from scratch.
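The "caption first, then append the tags" order described here could be scripted along the following lines. This is a minimal sketch: `append_tags` is a hypothetical helper, not part of CLIP Interrogator or the WD14 tagger, and it assumes both tools' outputs are already available as plain strings.

```python
def append_tags(caption: str, tags: list[str]) -> str:
    """Append booru-style tags to a natural-language caption, skipping repeats."""
    parts = [caption.strip().rstrip(",")]
    seen: set[str] = set()
    for tag in tags:
        cleaned = tag.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            parts.append(cleaned)
    # Drop the caption slot if it was empty so the result never starts with ", ".
    return ", ".join(part for part in parts if part)


print(append_tags("a woman standing in a field", ["1girl", "outdoors", "1girl"]))
```

The manual adjustments the post mentions would still happen afterwards on the combined string.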
- Captioning LoRA's
stable-diffusion-webui-dataset-tag-editor
-
Using hydrus for managing tags of training data
There are few tools for mass-tagging data, each with its own problems. stable-diffusion-webui-dataset-tag-editor has good features, but it also has bugs that make it nearly unusable, and it is resource-heavy because it runs in the webUI alongside Stable Diffusion, which always has models loaded. BooruDatasetTagManager lacks many useful features.
-
What program to use for mass editing tags for training images?
I tried stable-diffusion-webui-dataset-tag-editor, but it has a bug where it would get confused and sometimes swap tags from one image to another, ruining everything.
-
Experiment AI Anime w/ C-Net 1.1 + GroundingDINO + SAM + MFR (workflow)
Use WD 1.4 tagger (https://github.com/toriato/stable-diffusion-webui-wd14-tagger) to extract prompt words from each frame (threshold 0.65), then use the dataset tag editor (https://github.com/toshiaki1729/stable-diffusion-webui-dataset-tag-editor) for batch editing, mainly:
-
Civitai should enforce a replicability check
If you haven't come across them yet, these two guides: this and this are good reads, and this one for info about learning rates. Beyond what those guides cover, two things gave a large increase in my LoRA quality: better captioning, and resizing all the images to have about the same number of pixels as the resolution being trained. For captioning, I have a text file listing the types of tags I know I'll have to hit: subject (solo, 1girl, 1boy, those early tags); what kind of perspective (portrait, closeup, full body, etc.); where the character is looking (looking up, looking to the side, looking at viewer, etc.); what the perspective of the viewer is (from above, from below, pov, etc.); and common clothing tags for the character. So I have that off to the side, and then I load up this extension for webui. It has a bit of a learning curve, but I point it at the pictures I've gathered, get it to interrogate with all the models it offers except BLIP, and set the confidence threshold to 0.10 so it spits out lots of tags. After it interrogates all the pictures, I use the database feature to remove the duplicate tags, and then I save the database so it creates all the text files. Then I go to "edit caption of selected image" and select an image to caption from the left. At that point, on the right, the top box should be full of tags and the bottom one should be empty. I look at my checklist from my text file and start hitting all the areas I need to, which doesn't take long. Then I look up at the top box and read from left to right, top to bottom, one tag at a time, and if it's a relevant tag, I type it in the bottom box.
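The interrogate, threshold, de-duplicate, and save steps in this post can be sketched outside the webui as follows. The extension's internals are not shown in the post, so `merge_tags` and `write_caption` are hypothetical helpers, and the per-model score dictionaries are assumed inputs standing in for whatever the interrogators return.

```python
from pathlib import Path


def merge_tags(per_model_scores: list[dict[str, float]], threshold: float = 0.10) -> list[str]:
    """Keep each tag that any model scores at or above `threshold`, once only."""
    best: dict[str, float] = {}
    for scores in per_model_scores:
        for tag, score in scores.items():
            if score >= threshold and score > best.get(tag, -1.0):
                best[tag] = score
    # Highest-confidence tags first; ties are broken alphabetically.
    return sorted(best, key=lambda tag: (-best[tag], tag))


def write_caption(image_path: Path, tags: list[str]) -> Path:
    """Write a comma-separated caption file next to the image (same stem, .txt)."""
    caption_path = image_path.with_suffix(".txt")
    caption_path.write_text(", ".join(tags), encoding="utf-8")
    return caption_path


# Made-up scores from two interrogator models for one image.
model_a = {"1girl": 0.98, "solo": 0.50, "hat": 0.05}
model_b = {"solo": 0.70, "smile": 0.12}
print(merge_tags([model_a, model_b]))
```

A low threshold such as 0.10 deliberately over-generates tags, which suits the manual review pass described in the post.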
-
embed txt tags
I have been using this: https://github.com/toshiaki1729/stable-diffusion-webui-dataset-tag-editor to get tags on some random images (not for a dataset, just for ease of browsing personal photos and such). Unfortunately, it exports a txt file and doesn't know how to do XMP or tag embedding. Does anyone know of a way to embed the exported txt file into the image keywords/categories/whatever it supports (based on format), or a quick way to convert it to an XMP sidecar file? Not necessarily related to AI generation, but it is related to AI usage. Hopefully someone knows the answer or can point me to where to find it.
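One possible approach to the sidecar conversion asked about here is to write the tags as a `dc:subject` keyword bag in a minimal XMP document. This is a sketch under assumptions: the tag file is taken to be comma-separated, the sidecar is named after the file stem with an `.xmp` suffix (some applications expect `image.jpg.xmp` instead), and `tags_to_xmp` and `txt_to_sidecar` are hypothetical helper names, not part of the extension.

```python
from pathlib import Path
from xml.sax.saxutils import escape

# Minimal XMP skeleton holding keywords in dc:subject.
XMP_TEMPLATE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:subject>
    <rdf:Bag>
{items}
    </rdf:Bag>
   </dc:subject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
"""


def tags_to_xmp(tags: list[str]) -> str:
    """Render tags as an XMP document, escaping XML special characters."""
    items = "\n".join(f"     <rdf:li>{escape(tag)}</rdf:li>" for tag in tags)
    return XMP_TEMPLATE.format(items=items)


def txt_to_sidecar(txt_path: Path) -> Path:
    """Read a comma-separated tag file and write a sidecar with the same stem."""
    raw = txt_path.read_text(encoding="utf-8")
    tags = [tag.strip() for tag in raw.split(",") if tag.strip()]
    xmp_path = txt_path.with_suffix(".xmp")
    xmp_path.write_text(tags_to_xmp(tags), encoding="utf-8")
    return xmp_path
```

Whether a given photo manager picks up the sidecar depends on that application; embedding keywords directly into the image file would need a dedicated metadata tool instead.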
-
Automatic1111 extensions. What're your must-haves?
Dataset Tag Editor is perfect for editing large datasets and their caption files. It's been around for a couple months and I only found out about it the other day. I could have saved so much time manually editing hundreds of caption files....
-
Questions About Improving Embeddings/Hypernetwork Results
There is one extension I use however: https://github.com/toshiaki1729/stable-diffusion-webui-dataset-tag-editor
What are some alternatives?
clip-interrogator - Image to prompt with BLIP and CLIP
BooruDatasetTagManager
batch-face-swap - Automatically detects faces and replaces them
sd-webui-additional-networks
sd_dreambooth_extension
kohya-trainer - Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
stable-diffusion-webui - Stable Diffusion web UI
stable-diffusion-webui-depthmap-script - High Resolution Depth Maps for Stable Diffusion WebUI
automatic - SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
sd-webui-image-sequence-toolkit - Extension for AUTOMATIC1111's WebUI