Do we need a specialized vector database?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • txtai

    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

  • There isn't a best universal choice for all situations. If you're already using Postgres and all you want is to add vector search, pgvector might be good enough.

    txtai (https://github.com/neuml/txtai) sets out to be an all-in-one embeddings database. This is more than just being a vector database with semantic search. It can embed text into vectors, run LLM workflows, has components for sparse/keyword indexing and graph based search. It also has a relational layer built-in for metadata filtering.

    txtai currently supports SQLite/DuckDB for relational data but can be extended. For example, relational data could be stored in Postgres, sparse/dense vectors in Elasticsearch/Opensearch and graph data in Neo4j.

    I believe modular solutions like this where internal components can be swapped in and out are the best option but given I'm the author of txtai, I'm a bit biased. This setup enables the scaling and reliability of existing solutions balanced with someone being able get started quickly with a POC to evaluate the use case.

  • ann-benchmarks

    Benchmarks of approximate nearest neighbor libraries in Python

  • The article makes the argument that it is easier to query vector spaces when your database already supports them: why use an external vector db when you can use pgvector in postgreSQL ?

    That is a fine argument if you don't mind that pgvector is second-to-worst amongst all open-source vector search implementations, and two orders of magnitude slower than the state of the art [1].

    The author also makes the argument that traditional DBs are better because they are battle-tested, and then goes and rewrites the pgvector plugin from C to rust.

    [1] http://ann-benchmarks.com

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts