- transformer-deploy
  Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
The end-to-end tutorial: https://github.com/ELS-RD/transformer-deploy/blob/main/demo/quantization_end_to_end.ipynb
NOTE:
The number of mentions on this list counts mentions in common posts plus user-suggested alternatives.
Hence, a higher number means a more popular project.
Related posts
- Convert Pegasus model to ONNX [Discussion]
- [P] What we learned by benchmarking TorchDynamo (PyTorch team), ONNX Runtime and TensorRT on transformers model (inference)
- [D] Is there an affordable way to host a diffusers Stable Diffusion model publicly on the Internet for "real-time"-inference? (CPU or Serverless GPU?)
- [D]deploy stable diffusion
- 30% Faster than xformers? voltaML vs xformers stable diffusion - NVIDIA 4090