DeepSpeed-MII
AITemplate
| | DeepSpeed-MII | AITemplate |
|---|---|---|
| Mentions | 6 | 37 |
| Stars | 1,629 | 4,448 |
| Growth | 7.0% | 1.2% |
| Activity | 8.7 | 9.1 |
| Last commit | 6 days ago | 9 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DeepSpeed-MII
- Stable Diffusion plus DeepSpeed
- [D] When chatGPT stops being free: Run SOTA LLM in cloud
  Microsoft/DeepSpeed-MII for an up to 40x reduction in inference cost on Azure. It also supports int8 and fp16 BLOOM out of the box, but it fails on Azure due to instance size.
- Image Creation Time for each GPU.
- Anyone tried DeepSpeed-MII with stablediffusion?
  Haven't tried it yet but they have some example code here: https://github.com/microsoft/DeepSpeed-MII/blob/main/examples/local/txt2img-example.py
- [P] Pure C/C++ port of OpenAI's Whisper
AITemplate
- Show HN: Shortbread, a web app that helps you create AI comics in minutes
  VoltaML is a relatively vanilla diffusers-based backend, so it's not a hairy monster to hack like you may have seen with SAI-based UIs.
  The AITemplate code is a lightly modified version of Facebook's example code, to get rid of small issues like VRAM spikes: https://github.com/facebookincubator/AITemplate/tree/main/ex...
  InvokeAI is also diffusers-based, but they seem to mess with the pipeline a bit more.
  And anyway, all that may be a better reference for interesting features rather than a backend to try and adopt.
- List of all the ways to improve performance for stable diffusion.
  Let me know if you discover any more ways to improve SD. I am currently looking into Facebook's AITemplate: https://github.com/facebookincubator/AITemplate
- [R] AITemplate Python to AMD compiler {META}
- Nearly 2x speedup for SD rendering using AITemplate
  Link to AITemplate itself: https://github.com/facebookincubator/AITemplate
- Render a neural network into CUDA/HIP code
- Render neural network into CUDA/HIP code
- AITemplate: a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
- A1111 vs Olive vs AITemplate.
What are some alternatives?
whisper.cpp - Port of OpenAI's Whisper model in C/C++
stable-diffusion-webui - Stable Diffusion web UI
petals - 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
nebuly - The user analytics platform for LLMs
xformers - Hackable and optimized Transformers building blocks, supporting a composable construction.
whisper-rs - Rust bindings to https://github.com/ggerganov/whisper.cpp
voltaML - ⚡ VoltaML is a lightweight library to convert and run your ML/DL deep learning models in high performance inference runtimes like TensorRT, TorchScript, ONNX and TVM.
XNNPACK - High-efficiency floating-point neural network inference operators for mobile, server, and Web
stable-diffusion-tensorflow - Stable Diffusion in TensorFlow / Keras
rocm-gfx803