Storing OpenAI embeddings in Postgres with pgvector

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • postgres

    Unmodified Postgres with some useful plugins (by supabase)

    Hey HN, this one has a cool back story with it, that really shows the power of open source.

    The author, Greg[0], wanted to use pgvector in a Postgres service, so he created a PR[1] in our Postgres repo. He then reached out and we decided it would be fun to collaborate on a project together, so he helped us build a "ChatGPT" interface for the supabase docs (which we will release tomorrow).

    This article explains all the steps you'd take to implement the same functionality yourself.

    I want to give a shout-out to pgvector too, it's a great extension [2]

    [0] Greg: https://twitter.com/ggrdson

    [1] pgvector PR: https://github.com/supabase/postgres/pull/472

    [2] pgvector: https://github.com/pgvector/pgvector
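
As a rough illustration of what storing and querying embeddings with pgvector can look like, here is a minimal sketch. The table name `documents`, its columns, and the helper functions are hypothetical; the `%s` placeholders assume a psycopg-style driver, and the 1536 dimension assumes the OpenAI embedding size mentioned in the thread. The `vector` type and the `<=>` cosine-distance operator come from pgvector itself.

```python
# Sketch only: builds the SQL and parameters; it does not open a database connection.

def to_vector_literal(embedding):
    """Format a sequence of numbers as pgvector text input, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

# Hypothetical schema: one row per document chunk with its embedding.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(1536)
);
"""

# Placeholders follow the psycopg convention (an assumption about the driver).
INSERT_SQL = "INSERT INTO documents (content, embedding) VALUES (%s, %s::vector)"

# <=> is pgvector's cosine distance operator; smaller means more similar.
SEARCH_SQL = (
    "SELECT id, content FROM documents "
    "ORDER BY embedding <=> %s::vector LIMIT %s"
)

def search_params(query_embedding, limit=5):
    """Return the parameter tuple to pass alongside SEARCH_SQL."""
    return (to_vector_literal(query_embedding), limit)
```

With a live connection, one would pass `SEARCH_SQL` and `search_params(query_embedding)` to the driver's `execute` call; the embedding itself would come from whichever embedding model is in use.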

  • pgvector

    Open-source vector similarity search for Postgres

  • Milvus

    A cloud-native vector database, storage for next generation AI applications

    First time I've heard of pgvector - for folks with experience, how does it compare to other ANN plugins (e.g. Redis https://redis.io/docs/stack/search/reference/vectors/) and purpose-built vector databases (e.g. Milvus https://milvus.io)?

    Curious about both performance/QPS and scale/# of vectors.

  • txtai

    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

    You might want to check out txtai (https://github.com/neuml/txtai). Its default configuration is a FAISS index paired with a SQLite database for filtering.

    Also worth mentioning that there are plenty of other vector models to try outside of OpenAI. Many are open-source and much smaller than 1536 dimensions. Check out the Hugging Face Hub (https://hf.co/models). For example, this model (https://huggingface.co/sentence-transformers/all-MiniLM-L6-v...) works great in many cases and is only 384 dimensions. It runs great locally and is FOSS.
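
To put the dimension difference in perspective, here is a back-of-the-envelope sketch of raw vector storage. It assumes pgvector's documented layout of 4 bytes per dimension plus 8 bytes of overhead per vector, and it ignores row overhead, other columns, and indexes, so treat the numbers as a lower bound. The function names are made up for illustration.

```python
def vector_bytes(dimensions):
    """Approximate size of one stored vector (assumed: 4 bytes per dimension + 8)."""
    return 4 * dimensions + 8

def vectors_gib(num_vectors, dimensions):
    """Raw vector storage in GiB for num_vectors rows, excluding indexes and row overhead."""
    return num_vectors * vector_bytes(dimensions) / 2**30

# 1536 dimensions -> 6152 bytes per vector; 384 dimensions -> 1544 bytes per vector.
large = vector_bytes(1536)
small = vector_bytes(384)
ratio = large / small  # roughly 4x less raw storage for the smaller model
```

The same roughly 4x factor applies to the work done per distance computation, which is one reason smaller models can be attractive when they are accurate enough for the task.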

  • Typesense

    Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

    Disclaimer: I work on Typesense [1] (an open source alternative to Algolia + Pinecone) and we recently added Vector Search as a feature to Typesense [2].

    Postgres can do a lot of things, but for large enough datasets and/or when you want to add filtering into the mix along with vector search, then it becomes slow. And at that point you want to use a dedicated vector search database.

    It's similar to how Postgres can also do full text search, but for large datasets and/or when you want to add typo tolerance, faceting, grouping, filtering, synonyms, etc - the usual features you'd need when implementing a search feature - it becomes slow to do this in pg and you'd then use a dedicated search engine.

    In Typesense, we've now combined Vector Search along with filtering based on attributes in your documents, so you get the best of both worlds [2].

    [1] https://typesense.org/
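
The idea of combining attribute filtering with vector search can be sketched in a few lines. This is a brute-force illustration of "filter first, then rank by similarity", not Typesense's API or implementation; the document shape and function names are hypothetical, and embeddings are assumed to be non-zero.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filtered_search(docs, query, predicate, k=3):
    """Keep documents matching predicate, then return the k most similar to query."""
    candidates = [d for d in docs if predicate(d)]
    candidates.sort(
        key=lambda d: cosine_similarity(d["embedding"], query), reverse=True
    )
    return candidates[:k]

# Toy data with 2-dimensional embeddings.
docs = [
    {"id": 1, "category": "docs", "embedding": [1.0, 0.0]},
    {"id": 2, "category": "blog", "embedding": [0.9, 0.1]},
    {"id": 3, "category": "docs", "embedding": [0.0, 1.0]},
]

hits = filtered_search(docs, [1.0, 0.0], lambda d: d["category"] == "docs", k=2)
# hits contains documents 1 and 3; document 2 is excluded by the filter
# even though it is closer to the query than document 3.
```

A brute-force scan like this is linear in the number of documents, which is the cost that approximate indexes aim to avoid at scale.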

  • faiss

    A library for efficient similarity search and clustering of dense vectors.

    One downside of pgvector is that it currently only supports one type of index (ivfflat), while others (FAISS, Milvus, qdrant, etc.) support other types of indices that can be advantageous depending on your workload (properties of vectors, size of dataset). See [1] for some more background.

    [1] https://github.com/facebookresearch/faiss/wiki/Guidelines-to...
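
For readers unfamiliar with the index type mentioned above, here is a toy sketch of the general IVF-flat idea: cluster the vectors into lists, then at query time scan only the lists whose centroids are closest to the query. It is a simplified pure-Python illustration, not pgvector's or FAISS's implementation; the class and parameter names are hypothetical, although `lists` and `probes` mirror the names pgvector uses for these knobs.

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k, iters=10, seed=0):
    """A few rounds of plain k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: l2(v, centroids[c]))
            buckets[nearest].append(v)
        for c, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

class IVFFlat:
    def __init__(self, vectors, lists, seed=0):
        self.vectors = vectors
        self.centroids = kmeans(vectors, lists, seed=seed)
        # Inverted lists: for each centroid, the ids of the vectors assigned to it.
        self.lists = [[] for _ in range(lists)]
        for i, v in enumerate(vectors):
            c = min(range(lists), key=lambda j: l2(v, self.centroids[j]))
            self.lists[c].append(i)

    def search(self, query, k, probes=1):
        """Scan the `probes` closest lists and return ids of the k nearest vectors found."""
        order = sorted(
            range(len(self.centroids)), key=lambda j: l2(query, self.centroids[j])
        )
        candidates = [i for j in order[:probes] for i in self.lists[j]]
        candidates.sort(key=lambda i: l2(query, self.vectors[i]))
        return candidates[:k]
```

Raising `probes` trades speed for recall: with `probes` equal to the number of lists the search degenerates to an exact scan, while `probes=1` is fastest but can miss neighbors that fall just across a cluster boundary.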

  • hnswlib

    Header-only C++/python library for fast approximate nearest neighbors

    https://github.com/nmslib/hnswlib

    Used it to index 40M text snippets in the legal domain. Allows incremental adding.

    I love how it just works. You know, it doesn’t ANNOY me or make a FAISS. ;-)


