serving-compare-middleware vs tritony
| | serving-compare-middleware | tritony |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 14 | 37 |
| Growth | - | - |
| Activity | 0.0 | 6.4 |
| Last commit | 10 months ago | 5 months ago |
| Language | Python | Python |
| License | MIT License | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
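The site does not publish the formula behind the activity number, so the following is only a minimal sketch of one way such a score could behave, under two assumptions that are not from the source: commits are weighted by an exponential decay with a hypothetical 30-day half-life, and the 0-10 score is the share of tracked projects with a lower weighted value. The function names and the sample data are illustrative.

```python
from datetime import date


def weighted_commit_activity(commit_dates, today, half_life_days=30.0):
    # Hypothetical weighting: a commit's weight halves every `half_life_days`,
    # so recent commits count more than older ones.
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_score(raw, all_raw):
    # Share of tracked projects with a strictly lower raw value, scaled to 0-10.
    below = sum(1 for r in all_raw if r < raw)
    return round(10.0 * below / len(all_raw), 1)


today = date(2024, 1, 31)
commits = [date(2024, 1, 31), date(2024, 1, 1)]  # one today, one 30 days ago
raw = weighted_commit_activity(commits, today)   # 1.0 + 0.5

population = [float(i) for i in range(100)]      # made-up raw values for 100 projects
score = activity_score(90.0, population)         # 90 of 100 projects rank lower
```

Under these assumptions, a project that outranks 90 of 100 tracked projects gets a score of 9.0, which matches the "top 10%" reading given above.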
serving-compare-middleware
A Quantitative Comparison of Serving Platforms for Neural Networks
For this experiment, we ran the models (or rather, the serving platforms) using Docker Compose. You can find the relevant manifests here: https://github.com/Biano-AI/serving-compare-middleware/blob/master/docker-compose.test.yml
tritony
Are you using `Triton Inference Server`?
Check out https://github.com/rtzr/tritony!
What are some alternatives?
Real-Time-Voice-Cloning - Clone a voice in 5 seconds to generate arbitrary speech in real-time
vllm - A high-throughput and memory-efficient inference and serving engine for LLMs
Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]
budgetml - Deploy a ML inference service on a budget in less than 10 lines of code.
jina - ☁️ Build multimodal AI applications with cloud-native stack
DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
transformers - 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
quick-deploy - Optimize, convert and deploy machine learning models as a fast inference API using Triton and ORT. Currently supports Hugging Face transformers, PyTorch, TensorFlow, SKLearn and XGBoost models.
transformer-deploy - Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
ColossalAI - Making large AI models cheaper, faster and more accessible
nni - An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
d2l-en - Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.