[P] Library for end-to-end neural search pipelines

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • cherche

    Neural Search


  • faiss

    A library for efficient similarity search and clustering of dense vectors.

    Cherche is compatible with large corpora such as Wikipedia and provides decent response times (see the notebook). I will benchmark Jina and Haystack under the same conditions, but there shouldn't be much difference, since the heavy lifting falls on Elasticsearch or on Faiss via retrieve.Encoder. The strength of Cherche is its fancy pipelines, composed of union and intersection operations between TfIdf, BM25, Flash, multiple rankers..., and these are less suited to Wikipedia. Cherche is better suited to industrial or personal corpora (<= 100000 documents).

  • flashtext

    Extract Keywords from sentence or Replace keywords in sentences.

    I started developing this tool after using Haystack. Pipelines are easier to build with cherche because of the operators. Also, cherche offers FlashText and Lunr.py retrievers, which are not available in Haystack and which I needed for the project I was working on. Haystack is clearly more complete, but I think it is also more complex to use.

  • lunr.py

    A Python implementation of Lunr.js 🌖
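
To make the "union and intersection" idea from the faiss entry concrete, here is a minimal, standard-library-only sketch of how operators can combine retrievers. It is not cherche's actual API: the `Retriever` class, the `keyword_retriever` helper, and the toy documents are all hypothetical names invented for illustration, and a naive keyword match stands in for real retrievers such as TfIdf or BM25.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Retriever:
    """Hypothetical retriever: wraps a function mapping a query to document ids."""

    search: Callable[[str], List[str]]

    def __call__(self, query: str) -> List[str]:
        return self.search(query)

    def __or__(self, other: "Retriever") -> "Retriever":
        # Union: results of the left retriever first, then any new ones from the right.
        def union(query: str) -> List[str]:
            return list(dict.fromkeys(self(query) + other(query)))

        return Retriever(union)

    def __and__(self, other: "Retriever") -> "Retriever":
        # Intersection: keep only documents that both retrievers return.
        def intersection(query: str) -> List[str]:
            allowed = set(other(query))
            return [doc_id for doc_id in self(query) if doc_id in allowed]

        return Retriever(intersection)


def keyword_retriever(documents: List[Dict[str, str]], field: str) -> Retriever:
    """Toy stand-in for a real retriever: match any query word in one field."""

    def search(query: str) -> List[str]:
        terms = set(query.lower().split())
        return [
            doc["id"]
            for doc in documents
            if terms & set(doc[field].lower().split())
        ]

    return Retriever(search)


documents = [
    {"id": "a", "title": "Paris", "article": "Paris is the capital of France"},
    {"id": "b", "title": "Lyon", "article": "Lyon is a city in France"},
    {"id": "c", "title": "France", "article": "A country in Europe"},
]

by_title = keyword_retriever(documents, "title")
by_article = keyword_retriever(documents, "article")

either = by_title | by_article  # match on title OR article
both = by_title & by_article    # match on title AND article

print(either("france"))
print(both("paris"))
```

Because each operator returns another `Retriever`, combined pipelines can themselves be combined again, which is roughly what makes operator-based composition convenient on small and medium corpora.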


