voltaML
⚡VoltaML is a lightweight library to convert and run your ML/DL deep learning models in high performance inference runtimes like TensorRT, TorchScript, ONNX and TVM.
Follow us here to get updates on the SD acceleration -> https://github.com/VoltaML/voltaML
-
AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Compare it to AITemplate please, suspect it won't be faster.
-
x-stable-diffusion
Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord community: https://discord.com/invite/TgHXuSJEk6
I was looking at this three days ago. The problem is that there seems to be a huge difference in what is being generated, judging by the example spread at https://github.com/stochasticai/x-stable-diffusion, whereas copying the model, params, and seed should give a near-identical image.
-
While I'm not commenting to necessarily disagree, that's not accurate for those using xformers. Also, there's some evidence that while results on a specific video card make/model may be reproducible on that same card, other makes/models might not match. Again, not meant to contradict the point you were trying to make, but subtle non-determinism is creeping around quite a bit in SD. FWIW.
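For anyone wanting to check the "same seed should give a near-identical image" claim on their own setup, here is a minimal sketch of the comparison. It does not call Stable Diffusion: `fake_generate` is a hypothetical stand-in for a real generator, and its `jitter` parameter only simulates the kind of small backend-dependent noise described above. The tolerance value is also an assumption, not a recommended threshold.

```python
import random


def fake_generate(seed, n=16, jitter=0.0, jitter_seed=None):
    """Hypothetical stand-in for an image generator.

    Returns n "pixel" values. The same seed always yields the same base
    values; `jitter` adds a small perturbation to mimic backend-dependent
    numerical noise (an assumption made for illustration only).
    """
    rng = random.Random(seed)            # seeded stream: reproducible
    noise = random.Random(jitter_seed)   # separate stream for simulated noise
    return [rng.random() + jitter * (noise.random() - 0.5) for _ in range(n)]


def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized outputs."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def near_identical(a, b, tol=1e-3):
    """True when the outputs differ by at most `tol` on average."""
    return mean_abs_diff(a, b) <= tol


baseline = fake_generate(seed=42)
same_seed = fake_generate(seed=42)
jittered = fake_generate(seed=42, jitter=1e-4, jitter_seed=1)
other_seed = fake_generate(seed=43)

print("same seed:          ", near_identical(baseline, same_seed))
print("same seed + jitter: ", near_identical(baseline, jittered))
print("different seed:     ", near_identical(baseline, other_seed))
```

With a real pipeline, the two lists would be replaced by the flattened pixel values of two generated images. A large difference at the same seed then points to a change in model, params, or backend rather than to ordinary numerical noise.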
-
d8ahazard/sd_dreambooth_extension (github.com)
-
Amazing! Is there any chance your technology could be applied to training/using OpenAI's Jukebox as well? I use both SD and Jukebox, and sadly Jukebox takes aggggges to generate even a minute of audio. (https://openai.com/blog/jukebox/)