Nitro: A fast, lightweight 3MB inference server with OpenAI-Compatible API

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. llama.cpp

    LLM inference in C/C++

    Although Rust is an amazing language with rapid adoption, it still has a relatively small user base in low-level programming, where llama.cpp operates. As a result, the pool of talent that can contribute to such projects is more limited. Almost all low-level programmers can write C++ if needed; for C programmers, it is essentially like writing C with classes. Rust programmers, especially those who care about low-level details such as throughput and latency, almost always have a background in C or C++. If llama.cpp were written in Rust, it would likely have far fewer contributors. Considering that one needs at least an interest in deep learning to contribute, the fact that it currently has 476 contributors is impressive. [1] I think this is one of the most important reasons the project can move so fast and has become such an essential project in the LLM scene.

    [1]: https://github.com/ggerganov/llama.cpp/graphs/contributors

  2. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  3. cortex.cpp

    Discontinued Local AI API Platform

  4. ollama

    Get up and running with Kimi-K2.6, GLM-5.1, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

    I recommend using https://ollama.ai/ if you don't want OpenAI compatibility.

  5. nitro

    Next Generation Server Toolkit. Create web servers with everything you need and deploy them wherever you prefer.

    Not to be confused with https://nitro.unjs.io, the server tech behind Nuxt and SolidStart.

  6. omnitool

    Official Omnitool repository

    Or you could use something like omnitool (https://github.com/omnitool-ai/omnitool) and interface with both cloud and local AI, not limited to LLMs.

  7. metal-cpp

    Metal-cpp is a low-overhead C++ interface for Metal that helps developers add Metal functionality to graphics apps, games, and game engines that are written in C++.

    My understanding is that the proliferation of “XYZ-cpp” AI frameworks is due to the C++ support in Apple’s GPU library, Metal, and the popularity of Apple silicon for inference (and there are a few technical reasons for this): https://developer.apple.com/metal/cpp/

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Introducing Jan

    4 projects | dev.to | 5 May 2024
  • Grammarly costs $12/mo — a local LLM does it for free (Chrome + Ollama)

    2 projects | dev.to | 15 Jun 2026
  • AI Pair Programming in Your Terminal with Aider and Ollama

    1 project | dev.to | 14 Jun 2026
  • Set Up Your Own ChatGPT: Ollama + Open WebUI for Data That Never

    1 project | dev.to | 10 Jun 2026
  • I Built a Free, Fully Local AI Resume Builder — No Subscriptions, No Cloud, No Catch

    1 project | dev.to | 10 Jun 2026
