You can take advantage of the fact that many of these sentences aren't even in the same neighbourhood by using a technique like locality-sensitive hashing, or a library like FAISS, to reduce the number of comparisons that are done up front.
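To make the locality-sensitive hashing idea a bit more concrete, here is a rough sketch of the random-hyperplane variant in plain Python. Each vector is assigned to a bucket according to which side of a few random hyperplanes it falls on, and only vectors that share a bucket get compared. The function names (`make_hyperplanes`, `signature`, `build_buckets`, `candidate_pairs`) and the parameters are made up for illustration. This is not how FAISS works internally, and a real setup would use several hash tables and tune the number of bits.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (as normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        sum(v * h for v, h in zip(vec, plane)) >= 0.0 for plane in hyperplanes
    )


def build_buckets(vectors, hyperplanes):
    """Group vector indices by signature; similar directions tend to collide."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, hyperplanes)].append(idx)
    return buckets


def candidate_pairs(buckets):
    """Yield only the index pairs that share a bucket, not all N^2 pairs."""
    for members in buckets.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                yield members[i], members[j]
```

You would then run the exact similarity computation only on the pairs that `candidate_pairs` yields. More bits give smaller buckets and fewer comparisons, at the cost of missing some true neighbours that land on opposite sides of a hyperplane.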
Nearest neighbour search isn't O(N²), and neither is building the index. If you have a machine with enough RAM, I would recommend ScaNN, as it works well and is incredibly fast. I'm not sure whether it works with an on-disk file format, though, which is what you would want.
Yeah, you can use Elasticsearch to do some of the heavy lifting when indexing vectors, although I've not personally used it for this exact use case. I think tools like Jina and Haystack make this super easy for you.