Rapid-MLX
vibes
| | Rapid-MLX | vibes |
|---|---|---|
| Mentions | 6 | 6 |
| Stars | 2,756 | 174 |
| Growth | 90.1% | 10.3% |
| Activity | 9.8 | 9.8 |
| Last commit | 4 days ago | 8 days ago |
| Language | Python | Go |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Rapid-MLX
- Chrome's Gemini Nano Prompt API: A Step-by-Step Guide
  💡 Make the fallback cheap to operate. The whole point of using Nano on the supported path is reduced cost. If your fallback is GPT-5.5 at $5/M tokens, you've moved the bill, not deleted it. Two patterns work well: (1) route the fallback to a smaller hosted model (Haiku, Gemini Flash, Mistral Small) that matches Nano's "short summarization" sweet spot; (2) for Mac users specifically, run Rapid-MLX as your /api/llm endpoint: Apple Silicon owners get on-device performance via your server's Mac, not theirs. Same thesis as our DeepClaude guide: the harness is one product, the model is another, and you can swap them.
- Anthropic is allowing the Claude CLI to run OpenClaw again
  > Large-context requests auto-route to a cloud LLM (GPT-5, Claude, etc.) when local prefill would be slow. Routing based on new tokens after cache hit. --cloud-model openai/gpt-5 --cloud-threshold 20000
  https://github.com/raullenchai/Rapid-MLX
- Show HN: Rapid-MLX – Run local LLMs on Mac, 2-3x faster than alternatives
- Gemma 4 on Apple Silicon: 85 tok/s with a pip install
  I've verified this end-to-end with structured output (output_type=BaseModel), streaming, multi-turn conversations, and multi-tool workflows. Test suite here.
- vLLM-mlx – 65 tok/s LLM inference on Mac with tool calling and prompt caching
  pip install git+https://github.com/raullenchai/vllm-mlx.git
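The cloud-routing excerpt above says the decision is based on new tokens after a cache hit, with a threshold set by --cloud-threshold 20000. A minimal Python sketch of that idea follows. It is not Rapid-MLX's implementation: the names `route`, `common_prefix_len`, and `RouteDecision` are hypothetical, and sending a request to the cloud only when the new-token count is strictly greater than the threshold is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RouteDecision:
    target: str       # "local" or "cloud"
    new_tokens: int   # tokens that still need prefill after the cache hit


def common_prefix_len(cached: Sequence[int], prompt: Sequence[int]) -> int:
    """Count how many leading tokens of the prompt match the cached prefix."""
    n = 0
    for a, b in zip(cached, prompt):
        if a != b:
            break
        n += 1
    return n


def route(prompt: Sequence[int], cached: Sequence[int],
          cloud_threshold: int = 20000) -> RouteDecision:
    """Pick a backend from the uncached part of the prompt (hypothetical logic).

    Only tokens past the shared prefix need local prefill, so they are
    what gets compared against the threshold, not the full prompt length.
    """
    hit = common_prefix_len(cached, prompt)
    new_tokens = len(prompt) - hit
    # Assumption: strictly more new tokens than the threshold goes to the cloud.
    target = "cloud" if new_tokens > cloud_threshold else "local"
    return RouteDecision(target=target, new_tokens=new_tokens)
```

Under these assumptions, a 25,000-token prompt whose first 10,000 tokens are already cached leaves 15,000 new tokens and stays local, while the same prompt with an empty cache would go to the cloud model.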
vibes
- Anthropic is allowing the Claude CLI to run OpenClaw again
  PSA: Since you are still required to use Claude Code and I have had a bunch of non-technical people asking me to make https://github.com/rcarmo/piclaw based on Claude rather than pi (which is never gonna happen), I have started pivoting its Python grand-daddy into a Go-based web front-end that runs Claude as an ACP agent.
  Still early days, but code is available, sort of works if you squint, and welcomes PRs: https://github.com/rcarmo/vibes/tree/go
- I'm going to build my own OpenClaw, with blackjack and bun
- Pi – a minimal terminal coding harness
  My current fave harness. I've been using it to great effect, since it is self-extensible, and added support for it to https://github.com/rcarmo/vibes because it is so much faster than ACP.
- A chatbot's worst enemy is page refresh
- Show HN: A simple mobile-focused Agent Control Protocol front-end
- Ask HN: Any real OpenClaw (Clawd Bot/Molt Bot) users? What's your experience?
  I ran it for a couple of days in a VM in my Proxmox cluster. It was cute, but so amazingly insecure (systemd + sudo + installing whatever it wanted, plus requiring Telegram for access - or another SIM card for Signal) that I just gave up and started building my own thing (https://github.com/rcarmo/vibes) so I could have a mobile experience I could trust over Tailscale and sandbox copilot CLI (or any ACP-compliant agent) in a container (I've also been working on https://github.com/rcarmo/webterm and https://github.com/rcarmo/agentbox, so I am 300% positive I can do better sandboxing and safer integrations...)
What are some alternatives?
Sacred - Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.
agentbox - Contain your coding agents (literally)
MindsDB - General-purpose AI designed for knowledge workers — creators, strategists, and operators — and individuals seeking AI systems they can truly control to help them get work done, with full flexibility to extend and deploy anywhere (VPC, on-prem, or cloud).
webterm - Yet another web terminal, but with style
gym - A toolkit for developing and comparing reinforcement learning algorithms.
openclaw - Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞