Top 23 vector-search Open-Source Projects
-
Project mention: Unlock Advanced Search Capabilities with Milvus and Read about RAG | dev.to | 2024-03-22
Get started with Milvus on GitHub.
-
Typesense
Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences
There are actually plenty of non-ES products that are way easier to integrate and tune (and get better results with less effort).
- Typesense (https://github.com/typesense/typesense)
- Algolia
- Google Programmable Search Engine (https://programmablesearchengine.google.com/about/)
-
qdrant
Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Also compare with qdrant's Rust implementation; they tout their performance. https://github.com/qdrant/qdrant/tree/master/lib/segment/src...
-
Weaviate
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
Project mention: pgvecto.rs alternatives - qdrant and Weaviate | libhunt.com/r/pgvecto.rs | 2024-03-13
-
orama
🌌 Fast, dependency-free, full-text and vector search engine with typo tolerance, filters, facets, stemming, and more. Works with any JavaScript runtime, browser, server, service!
Project mention: Sky's the Limit! Supercharging Your Astro Blog with Orama, the Ultimate Stargazing Search Engine! | dev.to | 2023-08-03
Let's break into the steps to utilize Orama and analyze how it works. I won't dig into the technical stuff because, hey, it's an open-source project, which means you can easily peek at the source code, no problemo!
-
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
-
txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.
-
Project mention: Ask HN: What are the drawbacks of caching LLM responses? | news.ycombinator.com | 2024-03-15
Just found this: https://github.com/zilliztech/GPTCache which seems to address this idea/issue.
-
Vespa (4.3k ⭐) → A fully featured search engine and vector database. It supports vector search (ANN), lexical search, and search in structured data, all in the same query. Integrated machine-learned model inference allows you to apply AI to make sense of your data in real time.
-
SPTAG
A distributed approximate nearest neighbor (ANN) search library that provides high-quality vector index build, search, and distributed online serving toolkits for large-scale vector search scenarios.
-
Resume-Matcher
Resume Matcher is an open source, free tool to improve your resume. It works by using language models to compare and rank resumes with job descriptions.
GitHub: https://github.com/srbhr/Resume-Matcher Website: https://www.resumematcher.fyi/ Discord: Resume Matcher's Discord Tech Stack: Python, NextJS, FastAPI, TypeScript
-
superduperdb
🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
-
We (Marqo) are doing a lot on 1 and 2. There is a huge amount to be done on the ML side of vector search and we are investing heavily in it. I think it has not quite sunk in that vector search systems are ML systems and everything that comes with that. I would love to chat about 1 and 2 so feel free to email me (email is in my profile). What we have done so far is here -> https://github.com/marqo-ai/marqo
-
Project mention: Transforming Postgres into a Fast OLAP Database | news.ycombinator.com | 2024-02-07
You're right. We're working on this currently. You can track the issue here: https://github.com/paradedb/paradedb/issues/717
-
vault-ai
OP Vault ChatGPT: Give ChatGPT long-term memory using the OP Stack (OpenAI + Pinecone Vector Database). Upload your own custom knowledge base files (PDF, txt, epub, etc) using a simple React frontend.
Project mention: I built an open source website that lets you upload large files, such as in-depth novels/ebooks or academic papers, and ask GPT4 questions based on your specific knowledge base. So far, I've tested it with long books like the Odyssey and random research PDFs, and I'm shocked at how incisive it is. | /r/ChatGPT | 2023-08-05
-
hora
🚀 efficient approximate nearest neighbor search algorithm collections library written in Rust 🦀 .
Project mention: Building a Vector Database with Rust to Make Use of Vector Embeddings | /r/rust | 2023-06-01
We have been playing around with Hora as a replacement for the Rust-CV implementation as we want PQ as well. I'll check out instant-distance, looks very interesting!
-
usearch
Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
Project mention: USearch SQLite Extensions for Vector and Text Search | news.ycombinator.com | 2024-02-22
-
nextjs-openai-doc-search
Template for building your own custom ChatGPT style doc search powered by Next.js, OpenAI, and Supabase.
Project mention: Creating an advanced search engine with PostgreSQL | news.ycombinator.com | 2023-07-12
-
Utilizing sqlite-vss to store and query vector embeddings managed by a local SQLite database, Semantify conducts fast, precise vector searches within these embeddings to find and recommend relevant content, ensuring readers are presented with articles that truly match their interests.
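As a rough illustration of the kind of embedding lookup described above, here is a minimal sketch of a brute-force cosine-similarity search in plain Python. It does not use sqlite-vss or its API; the article IDs, the three-dimensional vectors, and the helper names `cosine_similarity` and `top_k` are all hypothetical and exist only to show the idea of ranking stored embeddings against a query embedding.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k=2):
    """Rank stored embeddings by similarity to the query; return the best k."""
    scored = [
        (cosine_similarity(query, vec), doc_id)
        for doc_id, vec in embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Hypothetical toy embeddings; real ones come from an embedding model
# and typically have hundreds of dimensions.
articles = {
    "rust-simd": [0.9, 0.1, 0.0],
    "postgres-olap": [0.1, 0.9, 0.2],
    "vector-db-intro": [0.8, 0.2, 0.1],
}

results = top_k([1.0, 0.0, 0.0], articles, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

A dedicated vector index replaces this linear scan with an approximate nearest neighbor structure, trading a little recall for much faster queries on large collections.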
-
Project mention: Show HN: SimSIMD vs. SciPy: How AVX-512 and SVE make SIMD cleaner and ML faster | news.ycombinator.com | 2023-10-07
vector-search related posts
- Unlock Advanced Search Capabilities with Milvus and Read about RAG
- pgvecto.rs alternatives - qdrant and Weaviate (3 projects | 13 Mar 2024)
- RAG is Dead. Long Live RAG!
- USearch SQLite Extensions for Vector and Text Search
- Are we at peak vector database?
- Qdrant, the Vector Search Database, raised $28M in a Series A round
- Ask HN: Is there any good semantic search GUI for images or documents?
Index
What are some of the best open-source vector-search projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | Milvus | 26,298 |
2 | Typesense | 17,425 |
3 | qdrant | 17,341 |
4 | Weaviate | 9,181 |
5 | orama | 7,760 |
6 | deeplake | 7,603 |
7 | txtai | 6,725 |
8 | GPTCache | 6,312 |
9 | vespa | 5,277 |
10 | SPTAG | 4,683 |
11 | Resume-Matcher | 4,394 |
12 | superduperdb | 4,229 |
13 | marqo | 4,040 |
14 | paradedb | 3,528 |
15 | vault-ai | 3,206 |
16 | gerev | 2,589 |
17 | hora | 2,545 |
18 | vearch | 1,884 |
19 | usearch | 1,540 |
20 | nextjs-openai-doc-search | 1,461 |
21 | sqlite-vss | 1,303 |
22 | awesome-vector-search | 1,228 |
23 | langchainrb | 962 |