ollama
llm
| ollama | llm | |
|---|---|---|
| 750 | 100 | |
| 173,924 | 12,031 | |
| 2.0% | 2.3% | |
| 9.9 | 9.0 | |
| about 13 hours ago | 1 day ago | |
| Go | Python | |
| MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ollama
-
Set Up Your Own ChatGPT: Ollama + Open WebUI for Data That Never
Download: Go to https://ollama.com/ and click on the download link for your operating system.
-
I Built a Free, Fully Local AI Resume Builder — No Subscriptions, No Cloud, No Catch
Most AI resume tools call out to OpenAI or Anthropic and charge you for every request. Persona supports Ollama — which means you can run the AI model locally on your own hardware, with zero API costs and zero data leaving your machine.
-
Sovereign Synapse: The Local Brain
To solve these, we built a stack that prioritizes integrity over ease. The centerpiece is Ollama, running the mxbai-embed-large model locally. This is the engine that translates human thought into high-dimensional coordinates.
-
How I Built a Self-Funding AI Lab: From Hobby to Side Income in 6 Months
Ollama for model serving
-
Flat Chat Threads Suck for Reading Books. So I Built a Local-First AI Tree Companion.
Fully offline: Point it at Ollama or LM Studio. Zero cost, nothing leaves your network.
-
Local LLM Hardware Requirements in 2026: What You Actually Need for Every Model Tier [Guide]
Recommended hardware: The RTX 3060 with 12 GB VRAM is the budget king here — all these models fit with room to spare for KV cache overhead, even Gemma 4:12B (which needs ~8.5–9 GB with overhead). An RTX 4060 Ti 16 GB gives you more headroom. On the Apple side, any M2 or M3 MacBook with 16 GB unified memory handles these models comfortably via Ollama's Metal backend.
-
Run Coding Agents on Local AI — Zero Cloud, Full Control
This guide shows how to swap out every cloud API with a local Ollama server running qwen3-coder:30b. Same tools, same workflows, no data leaving your network.
-
Running Brand-New Gemma 4 12B on an 8-Year-Old GTX 1080 Ti: Speed, 3 Gotchas, and Why Q8 Beat Q4 on My Own Field
Related: 35B MoE on 2× 1080 Ti · Ollama
-
Agent Skills in Microsoft Agent Framework
The sample is a tiny console app running entirely against a local Ollama model — no cloud keys, and every HTTP call is traced so I can see exactly what goes over the wire (complete sample code). There's a single skill on disk:
-
Quick and easy local AI RAG setup with JetBrains IDE integration and browser UI
irm https://ollama.com/install.ps1 | iex
llm
-
I benchmarked Python AI-app security scanners. Here's what each catches.
We ran all three (working) tools against simonw/llm, Simon Willison's clean CLI for LLMs, 48 Python files.
-
Your text file is the prompt now: LLM's shebang trick
Simon Willison's llm CLI tool already does a lot — run prompts, manage models, call tools, store logs in SQLite. But a Hacker News comment last week sent him down a rabbit hole, and the result is one of those tricks that makes you stop and stare.
-
Show HN: GoModel – an open-source AI gateway in Go; 44x lighter than LiteLLM
I've been maintaining an abstraction layer over multiple providers for a couple of years now - https://llm.datasette.io/
The best effort we have to defining a standard is OpenAI harmony/responses - https://developers.openai.com/cookbook/articles/openai-harmo... - but it's not seen much pickup. The older OpenAI Chat Completions thing is much more of an ad-hoc standard - almost every provider ends up serving up a clone of that, albeit with frustrating differences because there's no formal spec to work against.
The key problem is that providers are still inventing new stuff, so committing to a standard doesn't work for them because it may not cover the next set of features.
2025 was particularly turbulent because everyone was adding reasoning mechanisms to their APIs in subtly different shapes. Tool calls and response schemas (which are confusingly not always the same thing) have also had a lot of variance - some providers allow for multiple tool calls in the same response, for example.
My hunch is we'll need abstraction layers for quite a while longer, because the shape of these APIs is still too frothy to support a standard that everyone can get behind without restricting their options for future products too much.
-
3 AIs Reviewed the Same Codebase. They Disagreed on 2 Findings. That is the Point.
Simon Willison's llm is one of the better-engineered CLI tools in the Python ecosystem. It has a clean architecture, a comprehensive plugin system, and parameterized SQL throughout. The reviewers independently noted the consistent SQL safety, which speaks to the care that has gone into the project. We pointed our tools at it and filed the findings that survived review.
-
Claude Sonnet 4.6 System Card
Took me a while to create the pelican because I was busy adding Opus/Sonnet 4.6 support to my plugin for https://llm.datasette.io/ - pelican now available here, it's not quite as good as the Opus 4.6 one but does look equivalent to the Opus 4.5 one - and it has a snazzy top hat. https://simonwillison.net/2026/Feb/17/claude-sonnet-46/
-
AI Doesn't Reduce Work–It Intensifies It
> Has Simon actually produced anything novel or compelling?
Here are some of my recent posts which I self-evaluate as "novel and compelling".
- Running Pydantic’s Monty Rust sandboxed Python subset in WebAssembly https://simonwillison.net/2026/Feb/6/pydantic-monty/ - demonstrating how easy and useful it is to be able to turn Rust code into WASM that can run independently or be used inside a Python wheel for Pyodide in order to provide interactive browser demos of Rust libraries.
- Distributing Go binaries like sqlite-scanner through PyPI using go-to-wheel https://simonwillison.net/2026/Feb/4/distributing-go-binarie... - I think my go-to-wheel utility is really cool, and distributing Go CLIs through PyPI is a neat trick.
- ChatGPT Containers can now run bash, pip/npm install packages, and download files https://simonwillison.net/2026/Jan/26/chatgpt-containers/ - in which I reverse engineered and documented a massive new feature of ChatGPT that OpenAI hadn't announced or documented anywhere
I remain very proud of my current open source projects too - https://datasette.io and https://llm.datasette.io and https://sqlite-utils.datasette.io and a whole lot more: https://github.com/simonw/simonw/blob/main/releases.md
Are you ready to say none of that is "novel or compelling", in good faith?
- Show HN: GenAI Prompts as "Native" Programs
-
Don't fall into the anti-AI hype
My great regret from the past few years is that experimenting with LLMs has been such a huge distraction from my other work! My https://llm.datasette.io/ tool is from that era though, and it's pretty cool.
-
Ask HN: What's a standard way for apps to request text completion as a service?
When I'm writing a script that requires some kind of call to an LLM, I use this: https://github.com/simonw/llm.
- You Don't Need to Spend $100/Mo on Claude Code:Your Guide to Local Coding Models
What are some alternatives?
koboldcpp - Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
aichat - All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
SillyTavern - LLM Frontend for Power Users.
jan - Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.
textgen - Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 100% private.