ColBERT
haystack
Metric | ColBERT | haystack
---|---|---
Mentions | 4 | 55
Stars | 2,524 | 13,883
Growth | 7.0% | 4.3%
Activity | 8.4 | 9.9
Latest commit | about 1 month ago | 5 days ago
Language | Python | Python
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ColBERT
- Why Vector Compression Matters
  I’ll conclude by explaining how vector compression relates to ColBERT, a higher-level technique that Astra DB customers are starting to use successfully.
- How ColBERT Helps Developers Overcome the Limits of Retrieval-Augmented Generation
  ColBERT is a new way of scoring passage relevance using a BERT language model that substantially solves the problems with DPR. This diagram from the first ColBERT paper shows why it’s so exciting:
- FLaNK Stack 05 Feb 2024
- New free tool that uses fine-tuned BERT model to surface answers from research papers
  ColBERT and successors for retrieval.
haystack
- Haystack DB – 10x faster than FAISS with binary embeddings by default
  I was confused for a bit but there is no relation to https://haystack.deepset.ai/
- Release Radar • March 2024 Edition
  View on GitHub
- First 15 Open Source Advent projects
  4. Haystack by Deepset | GitHub | tutorial
- Generative AI Frameworks and Tools Every Developer Should Know!
  Haystack can be classified as an end-to-end framework for building applications powered by various NLP technologies, including but not limited to generative AI. While it doesn't directly focus on building generative models from scratch, it provides a robust platform for:
- Best way to programmatically extract data from a set of .pdf files?
  But if you want an API that you can use to develop your own flow, Haystack from Deepset could be worth a look.
- Which LLM framework(s) do you use in production and why?
  Haystack for production. We cannot afford breaking changes in our production apps. It's stable, the documentation is excellent, and did I mention it's STABLE!?
- Overview: AI Assembly Architectures
- Llama2 and Haystack on Colab
  I recently conducted some experiments with Llama2 and Haystack (https://github.com/deepset-ai/haystack), the NLP/LLM framework.
  The notebook can be helpful for those trying to load Llama2 on Colab.
  1) Installed Transformers from the main branch (and other libraries)
- Build with LLMs for production with Haystack – has 10k stars on GitHub
- Show HN: Haystack – Production-Ready LLM Framework
What are some alternatives?
qdrant - Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
langchain - 🦜🔗 Build context-aware reasoning applications
similarity - TensorFlow Similarity is a python package focused on making similarity learning quick and easy.
langchain - ⚡ Building applications with LLMs through composability ⚡ [Moved to: https://github.com/langchain-ai/langchain]
elasticsearch-learning-to-rank - Plugin to integrate Learning to Rank (aka machine learning for better relevance) with Elasticsearch
gpt-neo - An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
Milvus - A cloud-native vector database, storage for next generation AI applications
BentoML - The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!
awesome-semantic-search - A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.
label-studio - Label Studio is a multi-type data labeling and annotation tool with standardized output format
history_rag
jina - ☁️ Build multimodal AI applications with cloud-native stack