Build Personal ChatGPT Using Your Data

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • private-gpt

    Interact with your documents using the power of GPT, 100% privately, no data leaks

  • When running a Mac with Intel hardware (not M1), you may run into "clang: error: the clang compiler does not support '-march=native'" during pip install.

    If so, set ARCHFLAGS during pip install, e.g.: ARCHFLAGS="-arch x86_64" pip3 install -r requirements.txt

    https://github.com/imartinez/privateGPT#mac-running-intel

  • PdfGptIndexer

    An efficient tool for indexing and searching PDF text data using OpenAI API and FAISS (Facebook AI Similarity Search) index, designed for rapid information retrieval and superior search accuracy.

  • gpt4all

    gpt4all: run open-source LLMs anywhere

  • I assume this is the link: https://github.com/nomic-ai/gpt4all ?

  • gpt-2

    Code for the paper "Language Models are Unsupervised Multitask Learners"

  • txtai

    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

  • paperai

    📄 🤖 Semantic search and workflows for medical/scientific papers

  • https://github.com/neuml/paperai

    Disclaimer: I am the author of both

  • instructor-embedding

    [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

  • If you look at an embeddings leaderboard [1], one of the top competitors, InstructorXL [2], is just a pip install away. It's neck and neck with Ada v2 except for a shorter input length and half the dimensions, with the added benefit that you'll always have the model available.

    Most of the other options just work with the transformers library.

    [1] https://huggingface.co/spaces/mteb/leaderboard

    [2] https://github.com/HKUNLP/instructor-embedding
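
As a rough illustration of the "pip install away" point, the sketch below embeds a few texts with the InstructorEmbedding package. It assumes `pip install InstructorEmbedding` has been run (the package also pulls in PyTorch and sentence-transformers, and version compatibility may vary) and that the `hkunlp/instructor-xl` checkpoint can be downloaded. The helper names `build_pairs` and `embed` and the instruction string are illustrative choices, not part of the original comment.

```python
def build_pairs(instruction, texts):
    """Pair every text with the task instruction.

    INSTRUCTOR.encode takes a list of [instruction, text] pairs.
    """
    return [[instruction, text] for text in texts]


def embed(texts,
          instruction="Represent the document for retrieval:",
          model_name="hkunlp/instructor-xl"):
    """Return one embedding per text (hypothetical helper).

    The import is deferred so this file still loads on a machine
    where InstructorEmbedding is not installed.
    """
    from InstructorEmbedding import INSTRUCTOR  # third-party, assumed installed

    model = INSTRUCTOR(model_name)  # downloads the weights on first use
    return model.encode(build_pairs(instruction, texts))


if __name__ == "__main__":
    vectors = embed(["How do I index my PDFs?", "Local LLMs on a laptop"])
    print(vectors.shape)
```

The instruction prefix is what distinguishes this model family from a plain sentence embedder: the same text can be embedded differently for retrieval, classification, or clustering by changing that string.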

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • OpenLLM

    Run any open-source LLMs, such as Llama 2, Mistral, as OpenAI compatible API endpoint, locally and in the cloud.

  • vlite

    fast vector database made in numpy

  • I am working on a simple vector db just with numpy: https://github.com/sdan/vlite

    I think Milvus, Quickwit, and Pinecone are geared more towards enterprise and are hard to use.
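
To make the "simple vector db" idea concrete, here is a minimal sketch of brute-force cosine-similarity search over in-memory vectors. It is not vlite's API: the class `TinyVectorStore` and its methods are hypothetical, and the sketch uses only the Python standard library so it runs without dependencies, whereas vlite itself is built on numpy.

```python
import math


class TinyVectorStore:
    """Hypothetical in-memory store: no index, every query scans all items."""

    def __init__(self):
        self._items = []  # list of (text, unit-length vector)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("cannot normalize a zero vector")
        return [x / norm for x in vector]

    def add(self, text, vector):
        # Store unit vectors so cosine similarity reduces to a dot product.
        self._items.append((text, self._normalize(vector)))

    def search(self, query, k=3):
        """Return up to k (text, cosine similarity) pairs, best match first."""
        q = self._normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), text)
            for text, vec in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(text, score) for score, text in scored[:k]]


store = TinyVectorStore()
store.add("cats", [1.0, 0.0])
store.add("dogs", [0.0, 1.0])
store.add("kittens", [0.9, 0.1])
top = store.search([1.0, 0.0], k=2)
```

A linear scan like this is usually fine for a personal document collection of a few thousand chunks; dedicated vector databases mainly pay off once approximate indexes, persistence, or multi-user access are needed.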

  • easydiffusion

    Easiest 1-click way to create beautiful artwork on your PC using AI, with no tech knowledge. Provides a browser UI for generating images from text prompts and images. Just enter your text prompt, and see the generated image.

  • "Easiest 1-click way to install and use Stable Diffusion on your computer."

    https://github.com/easydiffusion/easydiffusion

    And while Whisper is from OpenAI, it is trivial to use locally and extremely useful:

    https://github.com/chidiwilliams/buzz

  • buzz

    Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

  • openai-cookbook

    Examples and guides for using the OpenAI API

  • Please provide this reference in your readme / blog as it is the original source for your work... and provides the background for the tradeoff between fine-tuning vs ask-search.

    https://github.com/openai/openai-cookbook/blob/main/examples...
