However, I suggest you "accelerate" your inference first. For example, you can use open-source inference engines (see: https://github.com/stochasticai/x-stable-diffusion) to easily accelerate your inference 2x or more. That means you can generate 2x more images per dollar on public clouds.
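The "2x more images per dollar" claim follows directly from the arithmetic of pay-per-hour cloud GPUs: halving per-image latency doubles throughput at the same hourly rate. A minimal sketch of that cost math (the latency and hourly-price figures below are illustrative assumptions, not quotes from any provider):

```python
def images_per_dollar(latency_s: float, gpu_cost_per_hour: float) -> float:
    """Images generated per dollar on a GPU billed by the hour."""
    images_per_hour = 3600.0 / latency_s
    return images_per_hour / gpu_cost_per_hour

# Assumed numbers for illustration: $1.10/hr GPU, 1.76s baseline latency,
# 0.88s after acceleration (i.e. a 2x speedup).
baseline = images_per_dollar(1.76, 1.10)
accelerated = images_per_dollar(0.88, 1.10)
print(accelerated / baseline)  # halving latency doubles images per dollar
```

The ratio is independent of the hourly price, so any 2x latency reduction translates to 2x more images per dollar regardless of which cloud or GPU you rent.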
Related posts
- [D] Is there an affordable way to host a diffusers Stable Diffusion model publicly on the Internet for "real-time"-inference? (CPU or Serverless GPU?)
- 30% Faster than xformers? voltaML vs xformers stable diffusion - NVIDIA 4090
- [P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels
- Convert Pegasus model to ONNX [Discussion]
- [P] What we learned by benchmarking TorchDynamo (PyTorch team), ONNX Runtime and TensorRT on transformers model (inference)