llama.cpp
zed
| | llama.cpp | zed |
|---|---|---|
| Mentions | 1,032 | 288 |
| Stars | 115,929 | 85,086 |
| Growth | 7.4% | 6.3% |
| Activity | 10.0 | 10.0 |
| Latest commit | 3 days ago | 1 day ago |
| Language | C++ | Rust |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
llama.cpp
- How to Setup a Local Coding Agent on macOS
> The benchmark prompt was:
> Write a compact Python function that parses a unified diff and returns the changed file paths. Then explain two edge cases.
> Each benchmark generated about 128 tokens.
Generating 128 tokens is probably not enough for good benchmark results. MTP speedup depends on how often the predicted tokens are accepted. In my experience, the very early output has a higher acceptance rate, so short testing can give false positive speedups.
Also llama.cpp includes a tool specifically for benchmarking:
https://github.com/ggml-org/llama.cpp/blob/master/tools/llam...
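To make the acceptance-rate point concrete, here is a rough sketch of how the number of tokens emitted per verification step changes with acceptance. It assumes MTP behaves like standard speculative decoding and that each drafted token is accepted independently with the same probability. Both are simplifying assumptions, and the function name and the 0.85 / 0.60 rates are made up for illustration, not measurements.

```python
def expected_tokens_per_step(accept_prob: float, draft_len: int) -> float:
    """Expected tokens emitted per verification step.

    Assumption: each drafted token is accepted independently with
    probability accept_prob, the first rejection ends the step, and the
    target model always contributes one token of its own.
    """
    if not 0.0 <= accept_prob <= 1.0:
        raise ValueError("accept_prob must be in [0, 1]")
    # Geometric series: 1 + a + a^2 + ... + a^draft_len
    return sum(accept_prob ** i for i in range(draft_len + 1))


# Hypothetical acceptance rates: high early in the output, lower later on.
for label, rate in (("first 128 tokens", 0.85), ("rest of a long output", 0.60)):
    tokens = expected_tokens_per_step(rate, draft_len=3)
    print(f"{label}: ~{tokens:.2f} tokens per step")
```

Tokens per step is only an upper bound on the speedup, since drafting has its own cost. The sketch just shows why a benchmark that stops while acceptance is still high can overstate the gain.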
- Doubling Qwen3.6-27B on One RTX 3090: ollama → llama.cpp + MTP, Lever by Lever (35.7 → 80.2 tok/s)
In my build, MTP came from mainline llama.cpp, not ik_llama. ik_llama got me to ~47 (engine + quant), but I couldn't get MTP running there: my build rejected the -mtp flags and ignored the model's nextn tensors. Mainline llama.cpp added MTP fairly recently (PR #22673, merged 2026-05-16), and that's where it worked for me. (There may well be an ik_llama path I missed; this is just what got it going on my box.)
- New `llama.cpp` Updates, AI Agents for Any LLM, and Quantized Vector Index for Local Inference
- Gemma 4 QAT models: Optimizing compression for mobile and laptop efficiency
- Introducing LlamaStash: a zero-overhead, terminal-native llama.cpp launcher
That script grew up. Today I'm releasing LlamaStash, the first public release of a fast, cross-platform, terminal-native launcher for llama.cpp with zero overhead.
- How fast is LlamaStash? Overhead, throughput, and a fair comparison with Ollama and LM Studio
LlamaStash spawns the unmodified upstream llama-server. So three different questions follow from that, and there is a benchmark suite for each.
- A 10 year old Xeon is all you need (for 26B-A4B MTP Drafters without GPU)
llama.cpp includes a benchmarking tool called llama-bench https://github.com/ggml-org/llama.cpp/blob/master/tools/llam...
ik_llama includes llama-sweep-bench https://github.com/ikawrakow/ik_llama.cpp/blob/main/examples...
When comparing hardware, the output of these tools is very helpful to let others put it into context. The post says the output is "reading speed" but knowing the prefill and token generation speeds would be a lot more helpful.
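As a small illustration of why the two speeds are worth reporting separately, the sketch below derives prefill and generation throughput from raw token counts and timings. The `RunTiming` type, its field names, and the numbers are hypothetical; they are not the output format of llama-bench or llama-sweep-bench.

```python
from dataclasses import dataclass


@dataclass
class RunTiming:
    """Hypothetical timing record for one run (not a real tool's schema)."""
    prompt_tokens: int
    prefill_seconds: float
    generated_tokens: int
    generation_seconds: float


def throughput(run: RunTiming) -> dict:
    """Return prefill, generation, and combined speeds in tokens per second."""
    total_tokens = run.prompt_tokens + run.generated_tokens
    total_seconds = run.prefill_seconds + run.generation_seconds
    return {
        "prefill_tok_s": run.prompt_tokens / run.prefill_seconds,
        "generation_tok_s": run.generated_tokens / run.generation_seconds,
        "combined_tok_s": total_tokens / total_seconds,
    }


# Made-up numbers: a long prompt processed quickly, a short answer generated slowly.
example = RunTiming(prompt_tokens=2048, prefill_seconds=16.0,
                    generated_tokens=256, generation_seconds=32.0)
print(throughput(example))
```

With these made-up numbers the combined figure is 48 tok/s, while generation alone runs at 8 tok/s, so a single blended speed hides the number most readers care about.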
- Racket v9.2 is now available
lol the same way we implement all of the reduced precision fp8, fp4 types today: by storing them in the corresponding uint:
https://github.com/ggml-org/llama.cpp/discussions/15095
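For readers unfamiliar with the trick, here is a minimal sketch of storing 4-bit floats in an unsigned byte, using the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit). The helper names and the nibble order (low nibble first) are assumptions for illustration, and any per-block scale factor that a real format applies is left out; this is not llama.cpp's actual code.

```python
# Magnitudes of the eight non-negative E2M1 codes, indexed by the low 3 bits.
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_e2m1(code: int) -> float:
    """Decode one 4-bit E2M1 code (0..15) into a Python float."""
    if not 0 <= code <= 0xF:
        raise ValueError("code must fit in 4 bits")
    magnitude = E2M1_MAGNITUDES[code & 0x7]
    # Bit 3 is the sign bit.
    return -magnitude if code & 0x8 else magnitude


def pack_byte(lo_code: int, hi_code: int) -> int:
    """Pack two 4-bit codes into one uint8 (assumed order: low nibble first)."""
    return ((hi_code & 0xF) << 4) | (lo_code & 0xF)


def unpack_byte(byte: int) -> tuple[float, float]:
    """Split one uint8 into the two FP4 values it stores."""
    return decode_e2m1(byte & 0x0F), decode_e2m1(byte >> 4)


print(unpack_byte(pack_byte(0x1, 0x2)))
```

The byte itself is just an integer as far as the language is concerned; the floating-point meaning only exists in the decode step.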
- Run Gemma-4 E2B-it with llama.cpp on Raspberry Pi4
- Gemma 4 dense by default: why your local agent doesn't want the MoE

    # Build llama.cpp with Metal backend
    git clone https://github.com/ggml-org/llama.cpp
    cd llama.cpp && cmake -B build -DGGML_METAL=ON && cmake --build build -j

    # Community-quantized GGUFs (Google ships safetensors; unsloth ships GGUF)
    huggingface-cli download unsloth/gemma-4-31B-it-GGUF \
      gemma-4-31B-it-Q4_K_M.gguf --local-dir .
    huggingface-cli download unsloth/gemma-4-26B-A4B-it-GGUF \
      gemma-4-26B-A4B-it-Q4_K_M.gguf --local-dir .

    # Benchmark: 200 generations of 512 tokens, log per-call timing
    ./build/bin/llama-bench -m gemma-4-31B-it-Q4_K_M.gguf -n 512 -r 200 -o json > dense.json
    ./build/bin/llama-bench -m gemma-4-26B-A4B-it-Q4_K_M.gguf -n 512 -r 200 -o json > moe.json
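Once both JSON files exist, the two runs can be compared with a few lines of Python. This is a sketch only: it assumes the output is a JSON array whose records carry `n_prompt`, `n_gen`, and `avg_ts` (average tokens per second) fields, which should be checked against the llama-bench version in use. The inline records are placeholder values, not benchmark results.

```python
import json


def generation_speed(bench_json: str) -> float:
    """Average tokens/s over the generation-only records in llama-bench JSON.

    Assumed schema: a JSON array of objects with "n_prompt", "n_gen",
    and "avg_ts" keys. Prompt-processing records are skipped.
    """
    records = json.loads(bench_json)
    gen = [r for r in records
           if r.get("n_gen", 0) > 0 and r.get("n_prompt", 0) == 0]
    if not gen:
        raise ValueError("no generation-only records found")
    return sum(r["avg_ts"] for r in gen) / len(gen)


# Placeholder records standing in for dense.json and moe.json.
dense = '[{"n_prompt": 512, "n_gen": 0, "avg_ts": 400.0}, {"n_prompt": 0, "n_gen": 512, "avg_ts": 20.0}]'
moe = '[{"n_prompt": 0, "n_gen": 512, "avg_ts": 50.0}]'

ratio = generation_speed(moe) / generation_speed(dense)
print(f"MoE / dense generation speed: {ratio:.2f}x")
```

To run it on real data, pass the file contents instead, for example `generation_speed(open("dense.json").read())`.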
zed
- Software Is Made Between Commits
A bit O/T but:
> I have never been a big fan of pull requests.
I guess this partly explains why Zed (still) lacks a PR review flow, let alone a coherent one, despite some interest [1]. Pretty much the only reason I'm still with JetBrains.
[1]: https://github.com/zed-industries/zed/discussions/34759
- Fable 5 is available in Zed
- Ask HN: What is your (AI) dev tech stack / workflow? (June 2026)
Devcontainers + Claude + Pi
[1] Zed https://zed.dev/
[2] Terminal threads https://zed.dev/blog/terminal-threads
As a sort of byproduct, I also replaced Alacritty + Zellij (I just don't need anything more, three weeks into the new setup).
- Tools I'm Using in 2026 (and what I've stopped using from 2025)
That said, sometimes you do just want an editor, not a full blown IDE, and for that for the last few months I've been experimenting with Zed. It's okay. I've had some weird issues with their terminal emulator; I don't know what they are doing but my ZSH config doesn't load right so it sometimes gets stuck in what looks like an infinite loop, and then my PATH is all messed up... IDK, I expect my editors to kinda just work, but overall, I like the look and feel and for when I just need to edit a file, it's fine.
- GitHub confirms breach of 3,800 repos via malicious VSCode extension
That's a link to a Hacker News post, which links to a Reddit post, which links to https://github.com/zed-industries/zed/issues/12589 if anyone wants to go right to the 'open' issue.
- Building a native terminal for AI coding agents in Rust + GPUI
This is a post-mortem, not a launch post. Paneflow is a native terminal workspace (splits, panes, branch-aware workspaces, session restore) built in pure Rust on top of Zed's GPUI framework and the upstream alacritty_terminal crate. It started as a port of cmux, a macOS-only Swift/AppKit project, and the Rust rewrite forced a string of decisions I had no good intuition for at the start. I want to walk through the ones that mattered: which UI frameworks I tried and rejected, how the GPUI/alacritty boundary actually looks, how dev-server detection works under the hood, the N-ary layout tree that replaced binary splits, the cross-platform PTY plumbing, the JSON-RPC control plane that makes agents first-class, and four lessons that surprised me.
- Zed Editor Theme-Builder
- Zed is 1.0
Copying my own comment below, with GH links and my (non-AI) summary after skimming:
> https://github.com/zed-industries/zed/issues/7054
> https://github.com/zed-industries/zed/issues/12589
> TL;DR: Mix of language tooling, unsigned proprietary blobs, corrupted and/or GLIBC-dependent files, redundant copies of already-installed executables. The Node packages especially are able to run scripts on install. Personal preference aside, might also create issues with security laws, certifications. All without user consent.
> Issues opened in January and June 2024. They've been rejected, closed, and opened a couple times since then. No changes directly improving this yet as of April 2026.
So... If you want broad language support via LSP servers, then you're going to have to bring in other ecosystems, and Node/Typescript is a big one that doesn't always have alternatives. [0] That's not a Zed-specific problem.
IMO the real issue with Zed is the "runs them by default without asking" part. Plus the questionable practices with binary blobs and the cavalier attitude towards it, when I can just use an editor that... Doesn't do any of that.
[0] https://microsoft.github.io/language-server-protocol/impleme...
- Parallel Agents in Zed
Just injecting this here: What I've been missing is an equivalent for GitHub's "blame prior revision" feature to quickly follow through the history of individual source lines.
https://github.com/zed-industries/zed/discussions/42583
Thanks for building an awesome product :)
- SpaceX and Cursor partnership. Right to acquire Cursor later this year
Zed - https://zed.dev/
Integrates a lot of agents (I use it with OpenRouter and directly with Pi) natively, is fast (you don't realise how laggy VSCode and its forks are).
Biggest disadvantage: lack of extensions. Lots of quality of life missing (e.g. gitignore integration to add/append gitignore files for different languages).
What are some alternatives?
koboldcpp - Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
helix - A post-modern modal text editor.
unsloth - Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
lapce - Lightning-fast and Powerful Code Editor written in Rust
mlc-llm - Universal LLM Deployment Engine with ML Compilation
pulsar - A Community-led Hyper-Hackable Text Editor