tomesd
AITemplate
Metric | tomesd | AITemplate |
---|---|---|
Mentions | 18 | 37 |
Stars | 1,207 | 4,455 |
Growth | - | 0.7% |
Activity | 5.4 | 8.7 |
Last commit | 5 months ago | 2 days ago |
Language | Python | Python |
License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tomesd
-
List of all the ways to improve performance for stable diffusion.
They show a speedup of up to 5.4x; you can see the author's results in the image on the GitHub repo here: https://github.com/dbolya/tomesd
-
Question about automatic1111 set up after changing gpu
Another optimization extension you can use is token merging, which reportedly gives around 5.4x faster image generation.
- +39%~51% faster at the cost of some details? ToMe officially arrives to Auto1111's webui v1.3.0
-
AUTOMATIC1111 updated to 1.3.0 version
It merges redundant tokens (https://github.com/dbolya/tomesd), so it can make generation slightly faster.
-
I made some changes in AUTOMATIC1111 SD webui, faster but lower VRAM usage
Mods patched - Tomesd - Pillow-SIMD - OpenCV-CUDA (WIP) - Removed some unused imports and startup checking - Improved performance with reduced VRAM usage (tested on txt2img only) - Added a new option to use external RealESRGAN with --external-realesrgan
-
Honest question, how are people getting ~35-40 it/sec on 4090? Mine spits 20 at most
Were the 40 it/s perhaps achieved with ToMe?
-
Vlad diffusion keeps growing. Big thanx to all supporters :)
Done! Proposal
-
Token Merging actually works and reduces generation time as well as RAM
This feature comes from this project: https://github.com/dbolya/tomesd
-
How can I squeeze every ounce of performance from web UI?
GitHub - dbolya/tomesd: Speed up Stable Diffusion with this one simple trick!
- Token Merging for Fast Stable Diffusion
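The token merging idea mentioned in the posts above (merging redundant tokens so later layers process fewer of them) can be sketched roughly as follows. This is a simplified, pure-Python illustration of a bipartite matching scheme, not tomesd's actual implementation: the function names `cosine` and `merge_tokens`, the alternating source/destination split, and plain averaging are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def merge_tokens(tokens, r):
    """Merge up to r tokens into their most similar partner (hypothetical helper).

    Tokens are split alternately into a source set and a destination set.
    The r source tokens most similar to some destination token are averaged
    into that destination token; the rest are kept unchanged.
    Note: the output order is (kept sources, then destinations), so the
    original token order is not preserved in this sketch.
    """
    src = tokens[0::2]  # candidates that may be merged away
    dst = tokens[1::2]  # tokens that always survive
    if not dst:
        return [list(t) for t in tokens]
    r = max(0, min(r, len(src)))

    # For each source token, find its best-matching destination token.
    best = []
    for i, s in enumerate(src):
        sims = [cosine(s, d) for d in dst]
        j = max(range(len(dst)), key=sims.__getitem__)
        best.append((sims[j], i, j))

    # Pick the r most redundant source tokens (highest similarity).
    to_merge = sorted(best, reverse=True)[:r]
    merged_idx = {i for _, i, _ in to_merge}

    # Average each merged source token into its destination token.
    sums = [list(d) for d in dst]
    counts = [1] * len(dst)
    for _, i, j in to_merge:
        sums[j] = [a + b for a, b in zip(sums[j], src[i])]
        counts[j] += 1
    merged_dst = [[v / c for v in s] for s, c in zip(sums, counts)]

    kept_src = [list(s) for i, s in enumerate(src) if i not in merged_idx]
    return kept_src + merged_dst


if __name__ == "__main__":
    toks = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0], [1.0, 1.0]]
    print(merge_tokens(toks, 1))
```

Fewer tokens means less work in the attention layers, which is where the reported speed and memory savings come from; merging more aggressively trades away detail. For real use, the tomesd README describes patching a loaded model with `tomesd.apply_patch(model, ratio=...)` rather than calling anything like the helper above.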
AITemplate
-
Show HN: Shortbread, a web app that helps you create AI comics in minutes
VoltaML is a relatively vanilla diffusers-based backend, so it's not a hairy monster to hack like you may have seen with SAI-based UIs.
The AITemplate code is a lightly modified version of Facebook's example code, to get rid of small issues like VRAM spikes: https://github.com/facebookincubator/AITemplate/tree/main/ex...
InvokeAI is also diffusers-based, but they seem to mess with the pipeline a bit more.
And anyway, all that may be a better reference for interesting features rather than a backend to try and adopt.
-
List of all the ways to improve performance for stable diffusion.
Let me know if you discover any more ways to improve SD. I am currently looking into Facebook's AITemplate: https://github.com/facebookincubator/AITemplate
- [R] AITemplate Python to AMD compiler {META}
-
Nearly 2x speedup for SD rendering using AITemplate
Link to AITemplate itself: https://github.com/facebookincubator/AITemplate
- Render a neural network into CUDA/HIP code
- Render neural network into CUDA/HIP code
- AITemplate: a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
- A1111 vs Olive vs AITemplate.
What are some alternatives?
stable-diffusion-webui-ux - Stable Diffusion web UI UX
stable-diffusion-webui - Stable Diffusion web UI
ComfyUI - The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
nebuly - The user analytics platform for LLMs
stable-diffusion-webui-tensorrt
xformers - Hackable and optimized Transformers building blocks, supporting a composable construction.
automatic - SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
voltaML - ⚡VoltaML is a lightweight library to convert and run your ML/DL deep learning models in high performance inference runtimes like TensorRT, TorchScript, ONNX and TVM.
sd-extension-system-info - System and platform info and standardized benchmarking extension for SD.Next and WebUI
stable-diffusion-tensorflow - Stable Diffusion in TensorFlow / Keras
diffusers - 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
rocm-gfx803