ollama
khoj
| ollama | khoj | |
|---|---|---|
| 750 | 53 | |
| 173,924 | 34,892 | |
| 2.0% | 1.9% | |
| 9.9 | 9.7 | |
| about 13 hours ago | 3 months ago | |
| Go | Python | |
| MIT License | GNU Affero General Public License v3.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ollama
-
Set Up Your Own ChatGPT: Ollama + Open WebUI for Data That Never
Download: Go to https://ollama.com/ and click on the download link for your operating system.
-
I Built a Free, Fully Local AI Resume Builder — No Subscriptions, No Cloud, No Catch
Most AI resume tools call out to OpenAI or Anthropic and charge you for every request. Persona supports Ollama — which means you can run the AI model locally on your own hardware, with zero API costs and zero data leaving your machine.
-
Sovereign Synapse: The Local Brain
To solve these, we built a stack that prioritizes integrity over ease. The centerpiece is Ollama, running the mxbai-embed-large model locally. This is the engine that translates human thought into high-dimensional coordinates.
-
How I Built a Self-Funding AI Lab: From Hobby to Side Income in 6 Months
Ollama for model serving
-
Flat Chat Threads Suck for Reading Books. So I Built a Local-First AI Tree Companion.
Fully offline: Point it at Ollama or LM Studio. Zero cost, nothing leaves your network.
-
Local LLM Hardware Requirements in 2026: What You Actually Need for Every Model Tier [Guide]
Recommended hardware: The RTX 3060 with 12 GB VRAM is the budget king here — all these models fit with room to spare for KV cache overhead, even Gemma 4:12B (which needs ~8.5–9 GB with overhead). An RTX 4060 Ti 16 GB gives you more headroom. On the Apple side, any M2 or M3 MacBook with 16 GB unified memory handles these models comfortably via Ollama's Metal backend.
-
Run Coding Agents on Local AI — Zero Cloud, Full Control
This guide shows how to swap out every cloud API with a local Ollama server running qwen3-coder:30b. Same tools, same workflows, no data leaving your network.
-
Running Brand-New Gemma 4 12B on an 8-Year-Old GTX 1080 Ti: Speed, 3 Gotchas, and Why Q8 Beat Q4 on My Own Field
Related: 35B MoE on 2× 1080 Ti · Ollama
-
Agent Skills in Microsoft Agent Framework
The sample is a tiny console app running entirely against a local Ollama model — no cloud keys, and every HTTP call is traced so I can see exactly what goes over the wire (complete sample code). There's a single skill on disk:
-
Quick and easy local AI RAG setup with JetBrains IDE integration and browser UI
irm https://ollama.com/install.ps1 | iex
khoj
- 25 Trending Self-Hosted Projects on GitHub
-
We Scanned 16 AI Agent Repos. 76% of Tool Calls Had Zero Guards.
A concrete example from our scan: Khoj, an open-source AI assistant, exposes a function called ai_update_memories that lets the LLM delete and replace user memories. It calls session.delete() followed by session.add() with no confirmation, no rate limit, and no validation on the content. A single adversarial prompt could wipe and replace a user's entire memory store.
-
Top 13 Self-Hosted Projects with the Most GitHub Stars
GitHub https://github.com/khoj-ai/khoj GitHub Star 12.4k GitHub Fork 627 GitHub Issue 64 GitHub Pull Request 3 GitHub Contributor 35 Open Source License AGPL-3.0 Official Website https://khoj.dev/ Documentation https://docs.khoj.dev/
-
Show HN: I made an app to use local AI as daily driver
There are already several RAG chat open source solutions available. Two that immediately come to mind are:
Danswer
https://github.com/danswer-ai/danswer
Khoj
https://github.com/khoj-ai/khoj
-
Ask HN: How do I train a custom LLM/ChatGPT on my own documents in Dec 2023?
I'm a fan of Khoj. Been using it for months. https://github.com/khoj-ai/khoj
-
You probably don’t need to fine-tune LLMs
https://github.com/khoj-ai/khoj
This is the easiest I found, on here too.
-
Show HN: Khoj – Chat Offline with Your Second Brain Using Llama 2
Thanks for the feedback. Does your machine have a GPU? 32GB CPU RAM should be enough but GPU speeds up response time.
We have fixes for the seg fault[1] and improvement to the query speed[2] that should be released by end of day today[3].
Update khoj to version 0.10.1 with pip install --upgrade khoj-assistant to see if that improves your experience.
The number of documents/pages/entries doesn't scale memory utilization as quickly and doesn't affect the search, chat response time as much
[1]: The seg fault would occur when folks sent multiple chat queries at the same time. A lock and some UX improvements fixed that
[2]: The query time improvements are done by increasing batch size, to trade-off increased memory utilization for more speed
[3]: The relevant pull request for reference: https://github.com/khoj-ai/khoj/pull/393
-
A Review: Using Llama 2 to Chat with Notes on Consumer Hardware
We recently integrated Llama 2 into Khoj. I wanted to share a short real-world evaluation of using Llama 2 for the chat with docs use-cases and hear which models have worked best for you all. The standard benchmarks (ARC, HellaSwag, MMLU etc.) are not tuned for evaluating this
What are some alternatives?
koboldcpp - Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
obsidian-smart-connections - Find related notes and excerpts while writing. Your link building copilot displays relevant content in graph + list view. A local embedding model powers semantic search. Zero setup. No API key.
SillyTavern - LLM Frontend for Power Users.
onyx - Open Source AI Platform - AI Chat with advanced features that works with every LLM
textgen - Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 100% private.
logseq-plugin-gpt3-openai - A plugin for GPT-3 AI assisted note taking in Logseq