Show HN: Khoj – Chat Offline with Your Second Brain Using Llama 2

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • khoj

    Your AI second brain. A copilot to get answers to your questions, whether they come from your own notes or from the internet. Use powerful online LLMs (e.g. GPT-4) or private, local ones (e.g. Mistral). Self-host locally or use our web app. Access from Obsidian, Emacs, the desktop app, the web, or WhatsApp.

  • Yeah, the gpt4all project is super neat. If folks are so inclined, it should be fairly straightforward to clone the Khoj project and swap out the model used. You'd have to update the model type in a few places, but a normal string/keyword search should find them. Then run it directly on your machine. You will, however, have to modify the prompt structure to fit the model's expectations. There's some guidance on that in this PR with Falcon: https://github.com/khoj-ai/khoj/pull/330/files#diff-7fa11396...

    I'll share my insight from experimenting with integrating Llama 2/GPT4All into Khoj -- Falcon 7B is probably the runner-up among models that can run on consumer hardware, and on my machine it really wasn't good enough (for me) to be useful. The token consumption with personal-notes context is too large, and the content too variable, for a small model like that to understand it. It's fine if you're just doing normal back-and-forth question answering, but you don't need Khoj for that.
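
    To illustrate the "modify the prompt structure" point above: each model family expects its own chat markers. A minimal sketch of Llama 2's chat format (the `[INST]`/`<<SYS>>` wrapping) is below; Falcon and other models use different markers, which is why swapping models means touching the prompt templates. The function name is illustrative, not from the Khoj codebase.

    ```python
    # Hedged sketch of Llama 2's chat prompt wrapping. Models like
    # Falcon expect different markers, so a model swap in Khoj also
    # means swapping templates like this one.
    def llama2_prompt(system: str, user: str) -> str:
        """Wrap a system message and user turn in Llama 2 chat markers."""
        return (
            "<s>[INST] <<SYS>>\n"
            f"{system}\n"
            "<</SYS>>\n\n"
            f"{user} [/INST]"
        )

    prompt = llama2_prompt("You are a helpful assistant.", "Summarize my notes.")
    ```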

  • llama-cpp-python

    Python bindings for llama.cpp

  • I see you’re using gpt4all; do you have a supported way to change the model being used for local inference?

    A number of apps that are designed for OpenAI’s completion/chat APIs can simply point to the endpoints served by llama-cpp-python [0], and function in (largely) the same way, while supporting the various models and quants supported by llama.cpp. That would allow folks to run larger models on the hardware of their choice (including Apple Silicon with Metal acceleration) or using other proxies like openrouter.io.

    [0]: https://github.com/abetlen/llama-cpp-python
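
    The repointing trick described above can be sketched without any special client library: you only change the base URL of an otherwise OpenAI-shaped request. The port and model-free payload below are assumptions (llama-cpp-python's server commonly listens on localhost:8000); sending the request requires that server to be running.

    ```python
    # Hedged sketch: aim an OpenAI-style chat request at a local
    # llama-cpp-python server instead of api.openai.com. The base
    # URL is an assumption about the local server's address.
    import json
    import urllib.request

    BASE_URL = "http://localhost:8000/v1"

    def build_chat_request(messages, base_url=BASE_URL):
        """Build (but don't send) an OpenAI-style chat completion request."""
        body = json.dumps({"messages": messages}).encode()
        return urllib.request.Request(
            f"{base_url}/chat/completions",
            data=body,
            headers={"Content-Type": "application/json"},
        )

    req = build_chat_request([{"role": "user", "content": "Summarize my notes."}])
    # With a local server running, you would then do:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.loads(resp.read())["choices"][0]["message"]["content"])
    ```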

  • ollama

    Get up and running with Llama 3, Mistral, Gemma, and other large language models.

  • This is a super cool project. Congrats! If you're looking at trying different models with one API, check out an open-source project a few folks and I have been working on since July, in case it's helpful: https://github.com/jmorganca/ollama

    Llama 2 gives great answers, even the 7B model. There's also an "uncensored" 7B version that George Sung has fine-tuned for topics the default Llama 2 model won't discuss -- e.g. I had trouble getting Llama 2 to review authentication/security code or topics: https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GG...

    If you do end up checking out Ollama, you can try the model with the command below (there's an API too):

      ollama run llama2-uncensored
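
    For the API route mentioned above, a minimal sketch of a non-streaming generate request follows. The default port (11434) and `/api/generate` endpoint reflect Ollama's local server; treat the exact payload shape as an assumption, and note that sending the request requires `ollama` to be running.

    ```python
    # Hedged sketch of an Ollama HTTP API call, assuming the server
    # is listening on its default local port.
    import json
    import urllib.request

    def build_generate_request(model, prompt, host="http://localhost:11434"):
        """Build (but don't send) a non-streaming Ollama generate request."""
        body = json.dumps({"model": model, "prompt": prompt, "stream": False})
        return urllib.request.Request(
            f"{host}/api/generate",
            data=body.encode(),
            headers={"Content-Type": "application/json"},
        )

    req = build_generate_request("llama2-uncensored", "Review this auth code: ...")
    # With Ollama running locally, you would then do:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.loads(resp.read())["response"])
    ```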

  • ripgrep-all

    rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

  • 1. If you want better adoption, especially among corporations, GPL-3 won't cut it. Maybe consider a more business-friendly license (MIT, etc.).

    2. I understand the excitement about LLMs. But how about making something more accessible? I use ripgrep-all (rga) along with fzf [1], which can search all files, including PDFs, in a specific folder. However, I would like a GUI tool that searches across multiple folders, prioritizes results across folders, and stores and searches histories so I can do a meta-search. This covers 95% of my local-search use cases, and I don't need an LLM for it. If Khoj could enable such search by default, without an LLM, that would be a game changer for many people who lack a heavy compute machine or don't want to use OpenAI.

    [1] https://github.com/phiresky/ripgrep-all/wiki/fzf-Integration
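
    The multi-folder, priority-ranked search the commenter describes can be sketched as a thin wrapper around rga. The ranking helper is pure Python; the `search` function assumes `rga` is installed and shells out to it, and both names are illustrative.

    ```python
    # Toy sketch of priority-ranked multi-folder search on top of rga.
    # rank_hits is pure; search assumes the rga binary is on PATH.
    import subprocess

    def rank_hits(hits, priority):
        """Order (folder, path) hits by per-folder priority (lower sorts first)."""
        return sorted(hits, key=lambda h: priority.get(h[0], len(priority)))

    def search(query, folders):
        """Collect rga matches from several folders as (folder, path) pairs."""
        hits = []
        for folder in folders:
            out = subprocess.run(
                ["rga", "--files-with-matches", query, folder],
                capture_output=True, text=True,
            ).stdout
            hits += [(folder, path) for path in out.splitlines()]
        return hits
    ```

    A GUI on top would then only need to call `rank_hits(search(q, folders), priority)` and render the result list.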

  • How-To-Tamper-With-Any-Electron-Application

    This work-in-progress outlines known vulnerabilities in the Electron framework, and how they may be abused to create dangerous exploits.

  • I followed this guide for modifying Electron apps.

    https://github.com/jonmest/How-To-Tamper-With-Any-Electron-A...

    Obsidian is not open source, so its code is minified and hard to read. But I was able to find the relevant code and just set the delay to 0.

    (I'm away from computer now, I'll see if I can find the code later.)

    What also helped is that all Electron apps are just Chromium so you can run the dev tools and the debugger! I think the hotkey is F12, and/or Ctrl+Shift+J.

  • Memacs

    What did I do on February 14th 2007? Visualize your (digital) life in Org-mode

  • You might look into tools like novoid's Memacs. The notion here is to build tools that push feeds and history data into Emacs. Using Org for your use case, together with the Khoj tool, could be the "glue" you need to tie it all together. https://github.com/novoid/Memacs#readme
