Thanks for sharing the code. What happens when existing content is updated and new content is created? Would embeddings need to be created for all of the content again? The current approach isn't ideal, since creating embeddings costs money. Please see https://github.com/mpaepper/content-chatbot/blob/main/create.... Would it be possible to update the vector store incrementally?
Please advise. Thank you.
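One way to avoid paying for embeddings twice is to store a hash of each page next to its vector and only re-embed pages whose hash has changed. The following is a minimal sketch of that idea, not the linked repository's code: `update_index`, the in-memory `index` dict, and the `embed` callable are all hypothetical names, and `embed` stands in for whatever paid embedding call is actually used.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a page's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def update_index(
    pages: Dict[str, str],
    index: Dict[str, Tuple[str, Vector]],
    embed: Callable[[str], Vector],
) -> List[str]:
    """Embed only new or changed pages and return the URLs that were (re)embedded.

    `pages` maps URL -> current text; `index` maps URL -> (hash, vector)
    and is updated in place. Both layouts are assumptions for this sketch.
    """
    changed = []
    for url, text in pages.items():
        digest = content_hash(text)
        cached = index.get(url)
        if cached is not None and cached[0] == digest:
            continue  # unchanged page: keep the stored vector, no embedding call
        index[url] = (digest, embed(text))
        changed.append(url)

    # Drop vectors for pages that no longer exist on the site.
    for url in list(index):
        if url not in pages:
            del index[url]
    return changed
```

With a real vector store, the same hash check would decide which documents to delete and re-add, so the cost scales with the number of changed pages rather than the size of the site.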
Looks interesting! Have you considered a proper vector database like Qdrant (https://qdrant.tech)? FAISS runs on a single machine, but if you want to scale things up, then a real database makes it a lot easier. And with a free 1GB cluster on Qdrant Cloud (https://cloud.qdrant.io), you can store quite a lot of vectors. Qdrant is also already integrated with Langchain.
Woah, that's a huge site!
It should be fine, though: as it iterates over the site, it creates embeddings and stores them in the FAISS store (https://github.com/facebookresearch/faiss), which was built to handle large numbers of embeddings.
For the actual queries, it narrows the results down to the most relevant documents, meaning those closest to the query in the embedding space, so this should work.
Let me know how it goes!
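To make the "closest in the embedding space" step concrete, here is a brute-force sketch of top-k retrieval by cosine similarity. It is not FAISS's API and not the project's code; `cosine_similarity`, `top_k`, and the toy two-dimensional vectors are illustrative assumptions. A library like FAISS performs the same kind of search with index structures that scale far better.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    docs: Sequence[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k most similar."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy example: three documents with made-up 2-dimensional embeddings.
docs = [("x", [1.0, 0.0]), ("y", [0.0, 1.0]), ("z", [1.0, 1.0])]
nearest = top_k([1.0, 0.1], docs, k=2)
```

Only the documents returned by a search like this are passed along with the question, which is why the size of the site matters less at query time than at indexing time.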
Using something like Weaviate, which can be started in Docker with a one-liner, lets you steer a search toward or away from a concept in the dense vector space. While computing dot products in manual code is fairly easy, letting Weaviate do the heavy lifting (for embeddings as well) makes things super simple.
https://github.com/FeatureBaseDB/slothbot/blob/slothbot-work...