similarity-search-kit
🔎 SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.
-
cozo
A transactional, relational-graph-vector database that uses Datalog for query. The hippocampus for AI!
If you want to play with a vector database and already use Postgres, there's pgvector[0]. It's pretty easy to spin up, and Supabase wrote a solid tutorial[1] (you don't need to run it on Supabase).
0 - https://github.com/pgvector/pgvector
1 - https://supabase.com/blog/openai-embeddings-postgres-vector
Why not choose an open-source solution like Milvus? It's free: https://github.com/milvus-io/milvus
With 4B vectors, you can look at methods like quantization and compression, both detailed here for Faiss - https://github.com/facebookresearch/faiss/wiki/Indexing-1G-v...
Elasticsearch uses HNSW. I'm not sure what options it offers, but quantization/compression will help reduce disk storage requirements. Alternatively, you can look at dimensionality reduction algorithms and store only that output in ES, or pick a model with a small number of dimensions. For example, https://huggingface.co/sentence-transformers/all-MiniLM-L6-v... has only 384 dims vs 768/1024/2048/4096.
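To put rough numbers on the point above, here is a minimal sketch of the storage arithmetic for 4B vectors, plus a simple symmetric int8 scalar quantizer. The function names are made up for illustration, the sizes ignore index overhead (HNSW graph links, metadata), and this is not how Faiss or Elasticsearch implement quantization internally.

```python
import struct


def raw_storage_gb(num_vectors: int, dims: int, bytes_per_value: int = 4) -> float:
    """Raw vector payload in GB (10**9 bytes), ignoring any index overhead."""
    return num_vectors * dims * bytes_per_value / 1e9


def quantize_int8(vector: list[float]) -> tuple[bytes, float]:
    """Symmetric scalar quantization: one signed byte per dimension."""
    max_abs = max(abs(x) for x in vector) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    codes = [round(x / scale) for x in vector]
    return struct.pack(f"{len(codes)}b", *codes), scale


def dequantize_int8(blob: bytes, scale: float) -> list[float]:
    """Approximate reconstruction; the error per value is at most scale / 2."""
    codes = struct.unpack(f"{len(blob)}b", blob)
    return [c * scale for c in codes]


NUM_VECTORS = 4_000_000_000  # the 4B vectors from the comment above
for dims in (384, 768, 1024):
    f32 = raw_storage_gb(NUM_VECTORS, dims, 4)
    i8 = raw_storage_gb(NUM_VECTORS, dims, 1)
    print(f"{dims} dims: float32 ~ {f32:,.0f} GB, int8 ~ {i8:,.0f} GB")
```

Under these assumptions, 384-dim float32 vectors need about 6,144 GB of raw storage, and int8 codes cut that to about 1,536 GB. Halving the dimension count and quantizing both shrink the payload linearly, so the two approaches stack.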
Well, to work on the core of the Qdrant engine (https://github.com/qdrant/qdrant) you should have some DB knowledge, but Rust skills are even more important. However, we also have other products, like the cloud platform (https://cloud.qdrant.io), where we are looking for different skills.
In the ANN benchmarks, Elastic sets the bottom bar afaict: http://ann-benchmarks.com/
The Coral Project [0] (commenting platform used on Washington Post, New York Times, The Verge) uses an Apache 2.0 license [1], which doesn't seem to have prevented it from raking in big SaaS customers.
A lot of people worry about copy-cat services, but it's fairly rare that someone else can host your service as well as you, the original developer, can. That's especially true once you consider the support and maintenance requirements of running a product you aren't personally developing.
I could see copy-cat services being more of an issue in the late stage of a product, though, once everyone knows a lot about how to stand it up and use it?
[0] https://coralproject.net/
If anyone is looking for a vector search engine, see Marqo: https://github.com/marqo-ai/marqo. It has additional functionality to make vector search much easier.
In the quest for ultimate speed, I started developing a vector database in assembly using GPT-4 as a side project: https://github.com/jn2clark/GPT4Memory
After working through several projects that used local hnswlib and different databases for text and vector persistence, I integrated open-source hnswlib with SQLite to create an embedded vector search engine that can easily scale up to millions of embeddings. For self-hosted situations with under 10M embeddings and less than insane throughput, I think this combo is hard to beat.
https://github.com/jiggy-ai/hnsqlite
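For anyone curious what the "text and vectors persisted together in SQLite" idea looks like, here is a rough standard-library sketch. It is not hnsqlite's API: `TinyVectorStore` and `cosine` are hypothetical names, and the HNSW index is swapped for an exact linear scan so the example runs without hnswlib. A real setup would keep the SQLite persistence and ask the ANN index for candidate ids instead of scanning every row.

```python
import math
import sqlite3
from array import array


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical embedded store: SQLite holds text and vectors side by side."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "id INTEGER PRIMARY KEY, text TEXT NOT NULL, vector BLOB NOT NULL)"
        )

    def add(self, text: str, vector: list[float]) -> int:
        blob = array("f", vector).tobytes()  # store as a float32 blob
        cur = self.db.execute(
            "INSERT INTO items (text, vector) VALUES (?, ?)", (text, blob)
        )
        self.db.commit()
        return cur.lastrowid

    def search(self, query: list[float], k: int = 3) -> list[tuple[float, str]]:
        """Exact search by linear scan (an assumption; HNSW would go here)."""
        scored = []
        for text, blob in self.db.execute("SELECT text, vector FROM items"):
            vec = array("f")
            vec.frombytes(blob)
            scored.append((cosine(query, vec), text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()  # in-memory; pass a file path to persist
store.add("cats", [1.0, 0.0, 0.0])
store.add("dogs", [0.9, 0.1, 0.0])
store.add("stocks", [0.0, 0.0, 1.0])
print(store.search([1.0, 0.0, 0.0], k=2))
```

The linear scan is O(n) per query, which is the part an HNSW index replaces with an approximate graph search once the collection grows.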
Also plugging my crappy vector database, which you probably shouldn't use for anything but a fun project. It can, however, be set up and used in seconds: https://github.com/corlinp/Victor
Are they forking Lucene or somehow getting the Lucene devs to increase that limit? Because this issue has been open for over a year now: https://github.com/apache/lucene/issues/11507
No - they just did something in Elasticsearch to make their own FieldType https://github.com/elastic/elasticsearch/pull/95257
Chroma runs on Windows, since I believe it's just a Python package: https://github.com/chroma-core/chroma
How are you guys thinking about the embedding generation side of things? It seems like that part has a generally hefty compute cost before it even gets into the index. I just open-sourced a Swift package to try to make that part as easy as possible; the example project exports directly to Pinecone. https://github.com/ZachNagengast/similarity-search-kit
If anyone wants to try a FOSS vector-relational-graph hybrid database for more complicated workloads than simple vector search, here it is: https://github.com/cozodb/cozo/
About the integrated vector search: https://docs.cozodb.org/en/latest/releases/v0.6.html
It also does duplicate detection (MinHash-LSH) and full-text search within the query language itself: https://docs.cozodb.org/en/latest/releases/v0.7.html
HN discussion a few days ago: https://news.ycombinator.com/item?id=35641164
Disclaimer: I wrote it.