You can try https://huggingface.co/. You can host and deploy any model there, and you can also host your datasets and private models.
Their transformers library has more than 50k stars: https://github.com/huggingface/transformers
If you want to serve your model at scale with a bunch of production features, you should have a look at the open-source framework Seldon Core. It does what you're asking for, plus a bunch of other cool stuff like routing, logging and monitoring.
(Disclosure: I am a maintainer on this project.) You should check out Bridge. It deploys models directly from an MLflow registry to SageMaker inference endpoints (hosted APIs). It basically turns your registry into a declarative source of truth for your hosting. The advantage of this approach is that it gives you a clean way to update or upgrade your APIs from the same place you're already tracking your new versions, experiments, etc. One source of truth. You can get an MLflow registry up in a couple of minutes if you don't have one.
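To make the "registry as a declarative source of truth" idea a bit more concrete, here is a rough sketch in plain Python. This is not Bridge's actual code or API, and it doesn't call MLflow or SageMaker at all. The names `Action` and `plan_sync` are made up for illustration, and both the registry and the deployed endpoints are assumed to be simple mappings from model name to version number. It only shows the general pattern: compare what the registry says should be live with what is live, and work out the changes needed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Action:
    """One step needed to bring hosting in line with the registry (hypothetical type)."""
    kind: str                      # "create", "update", or "delete"
    model: str
    version: Optional[int] = None  # target version; None for deletes


def plan_sync(registry: Dict[str, int], deployed: Dict[str, int]) -> List[Action]:
    """Diff desired state (registry) against actual state (deployed endpoints)."""
    actions = []
    # Anything in the registry should be live at exactly the registered version.
    for model, version in sorted(registry.items()):
        if model not in deployed:
            actions.append(Action("create", model, version))
        elif deployed[model] != version:
            actions.append(Action("update", model, version))
    # Anything live that the registry no longer lists gets torn down.
    for model in sorted(deployed):
        if model not in registry:
            actions.append(Action("delete", model))
    return actions


# Illustrative data only: the registry promotes churn to v3 and adds fraud v7,
# while an old "legacy" endpoint is still running.
registry = {"churn": 3, "fraud": 7}
deployed = {"churn": 2, "legacy": 1}
plan = plan_sync(registry, deployed)
for action in plan:
    print(action)
```

The point of doing it this way is that nobody edits endpoints by hand. You change the registry, and the plan of creates, updates and deletes falls out of the diff.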