

To solve this, we built a native extension in Edge Runtime that enables using the ONNX runtime via its Rust interface. This was made possible thanks to an excellent Rust wrapper called Ort.
Embedding generation uses the ONNX runtime under the hood. This is a cross-platform inferencing library that supports multiple execution providers, from CPU to specialized GPUs.
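
In user code, all of this surfaces as a small JavaScript API. Here is a minimal sketch of an Edge Function that generates an embedding, assuming the `Supabase.ai.Session` interface that Supabase's Edge Runtime documents; treat the `gte-small` model name and the run options as illustrative rather than canonical:

```ts
// `Supabase.ai` is a global provided by Edge Runtime; no import is needed
// inside an Edge Function. Inference runs in the native ONNX extension.
const model = new Supabase.ai.Session('gte-small');

Deno.serve(async (req: Request) => {
  const { text } = await req.json();

  // `mean_pool` and `normalize` collapse the token outputs into a single
  // unit-length vector for the input text.
  const embedding = await model.run(text, {
    mean_pool: true,
    normalize: true,
  });

  return new Response(JSON.stringify({ embedding }), {
    headers: { 'Content-Type': 'application/json' },
  });
});
```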
A complete semantic search demo is available in the supabase repository on GitHub.
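
Once you have embeddings, semantic search reduces to comparing vectors. In a demo like this the comparison typically lives in Postgres (for example via pgvector), but the underlying math is plain cosine similarity; here is a hypothetical in-process helper for illustration:

```ts
// Hypothetical helper: cosine similarity between two embedding vectors.
// Values close to 1 mean the two texts are semantically similar.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

If the embeddings were generated with `normalize: true`, they are already unit length, so the dot product alone equals the cosine similarity.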
LLM models are challenging to run directly via the ONNX runtime on CPU, so for these we are using a GPU-accelerated Ollama server under the hood. Ollama makes it easy to get up and running with models such as Llama 3.3, DeepSeek-R1, Phi-4, and Gemma 2.
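
For a sense of what happens behind the scenes, this is roughly what a direct call to an Ollama server's HTTP API looks like; the host, port, and model name below are assumptions for illustration, not the runtime's internal wiring:

```ts
// Sketch of a direct call to Ollama's /api/generate endpoint.
// OLLAMA_HOST and the model name are illustrative assumptions;
// 11434 is Ollama's default port.
const OLLAMA_HOST = 'http://localhost:11434';

const res = await fetch(`${OLLAMA_HOST}/api/generate`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'llama3',
    prompt: 'Why is the sky blue?',
    stream: false, // with true, Ollama streams newline-delimited JSON chunks
  }),
});

// With streaming disabled, the completion arrives in a single JSON body.
const { response } = await res.json();
console.log(response);
```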