python: 3.10.6 • torch: 1.13.1+cu117 • xformers: 0.0.16+814314d.d20230119 • commit: 54674674 • checkpoint: 61a37adf76. I get 18.79 it/s with all the extras installed (triton, DeepSpeed, TensorRT). I have not tested with torch 2.0.
I tried installing PyTorch 2.0.0 with triton from microsoft/DeepSpeed#2694 and compiling my own xformers, and it actually made my inference slower. At 512x512, batch size 1, with any sampling method, I dropped from 17-18 it/s to around 16-17 it/s; the regression was worse at batch size 8, going from 5.65 it/s to 4.66 it/s.
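For anyone comparing numbers across setups: the it/s figures above come from the WebUI's progress readout, but you can sanity-check a regression yourself with a plain timing loop. A minimal sketch (the workload here is a hypothetical stand-in for one sampling step, not actual Stable Diffusion code):

```python
import time

def measure_its(step_fn, n_iters=100):
    """Return iterations per second for a callable representing one step."""
    start = time.perf_counter()
    for _ in range(n_iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed

# Hypothetical stand-in for one denoising step; swap in your real
# sampler call (and add torch.cuda.synchronize() before/after timing
# if measuring GPU work, since CUDA launches are asynchronous).
its = measure_its(lambda: sum(i * i for i in range(10_000)))
print(f"{its:.2f} it/s")
```

Running the same loop before and after an upgrade (same resolution, batch size, and sampler) gives a like-for-like comparison, which matters since batch size clearly changes how big the regression is.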