ragrank
ragflow
| | ragrank | ragflow |
|---|---|---|
| Mentions | 1 | 10 |
| Stars | 23 | 9,054 |
| Growth | - | 32.4% |
| Activity | 9.5 | 9.8 |
| Last commit | 21 days ago | 6 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ragrank
I created Ragrank 🎯- An open source ecosystem to evaluate LLM and RAG.
Feel free to contribute on GitHub 💚
ragflow
Better RAG Results with Reciprocal Rank Fusion and Hybrid Search
Within our open source RAG product RAGFlow (https://github.com/infiniflow/ragflow), Elasticsearch is currently used instead of general-purpose vector databases because it can provide hybrid search right now. By default, an embedding-based reranker is not required; RRF alone is enough. Even when a reranker is used, keyword-based retrieval MUST still be combined with embedding-based retrieval, and that is exactly what RAGFlow's latest 0.7 release provides.
On the other hand, let me introduce another database we developed, Infinity (https://github.com/infiniflow/infinity), which can provide the fastest hybrid search. You can see the performance here (https://github.com/infiniflow/infinity/blob/main/docs/refere...): both vector search and full-text search perform much faster than other open source alternatives.
Starting with the next version (weeks away), Infinity will also provide more comprehensive hybrid search capabilities. The 3-way recall you mentioned (dense vector, sparse vector, keyword search) can then be served within a single request.
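For readers unfamiliar with RRF (Reciprocal Rank Fusion), here is a minimal sketch of the general technique the comment refers to. It is not taken from RAGFlow or Infinity: the function name `reciprocal_rank_fusion`, the sample document IDs, and the constant `k=60` (a commonly used default) are all assumptions made for illustration. Each document scores the sum of `1 / (k + rank)` over every result list it appears in, so documents ranked well by both keyword and vector retrieval rise to the top.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (assumed default of 60, not a RAGFlow setting).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from two retrievers over the same corpus.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

A third list, such as sparse-vector results, could be passed in the same way to approximate the 3-way recall described above; because RRF uses only ranks, the retrievers' raw scores never need to be normalized against each other.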
- Integrated Rerankers, implemented RAPTOR, RAGFlow 0.7 released
Ask HN: RAG and unstructured data from several docs
There are numerous strategies and methods available to enhance RAG performance, particularly when it comes to parsing vast amounts of unstructured data. Different scenarios also call for different parsing techniques. I would suggest exploring a RAG project that excels in document parsing: https://github.com/infiniflow/ragflow
- DeepSeek-V2 integrated, RAGFlow v0.5.0 is released
RAGFlow is an open-source RAG engine based on deep document understanding
Just link them to https://github.com/infiniflow/ragflow/blob/main/rag/llm/chat... :)
What are some alternatives?
unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
txtai - 💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows