Gonna respond here and correct both comments
>Some context for those who aren't in the loop: ONNX Runtime (https://onnxruntime.ai/) is a standardization format for AI models.
It's just an IR (intermediate representation), one of many - every framework has its own. And to be precise, ONNX is the format; ONNX Runtime is Microsoft's inference engine that executes it.
>Nowadays, it's extremely easy to export models in the ONNX format, especially language models with tools like Hugging Face transformers which have special workflows for it.
Meh, it's poorly supported by both PyTorch and TF. Why support Microsoft's IR when you have your own?
>probably the most performant ML runtime at this point.
Not even by a long shot - first-party compilers are generally faster because of smoother interop, but even among third-party options you have TensorRT (TRT) and TVM. TBH I have no idea what anyone uses ONNX for these days (legacy?).
I was going to answer the same. I find the approach of machine-learning compilers that compile models directly to host and device code better than having to ship a huge runtime. There are exciting projects in this area like TVM Unity [1], IREE [2], or torch.export [3].
[1] https://github.com/apache/tvm/tree/unity
[2] https://pytorch.org/get-started/pytorch-2.0/#inference-and-e...
[3] https://pytorch.org/get-started/pytorch-2.0/#inference-and-e...