Python information-retrieval

Open-source Python projects categorized as information-retrieval

Top 13 Python information-retrieval Projects

  • EasyOCR

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

    Project mention: [Question] Best approach for Optical Character recognition on large (20MB+) photos? | 2021-11-10

    Try easyocr or Tesseract. Both are pretty easy to use and don't need much background in OpenCV.

  • gensim

    Topic Modelling for Humans

    Project mention: Topic modelling with Gensim and SpaCy on startup news | 2022-01-17

    For the topic modelling itself, I am going to use the Gensim library by Radim Rehurek, which is very developer-friendly and easy to use.

  • haystack

    Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.

    Project mention: NLP architecture for paragraph extraction based on fixed question | 2022-01-23

    I suggest you check out Haystack and its Slack community. I have seen a similar discussion there.

  • ranking

    Learning to Rank in TensorFlow

    Project mention: [D] learning to Rank | 2021-02-21

    There are many different models and loss functions used for ranking (TensorFlow Ranking offers a bunch, probably also available for JAX / PyTorch / etc., or easily convertible).

  • InvoiceNet

    Deep neural network to extract intelligent information from invoice documents.

    Project mention: Pdfsandwich | 2021-11-06

  • pke

    Python Keyphrase Extraction module

    Project mention: Question on easing comprehension | 2021-09-15

  • forte

    Forte is a flexible and powerful NLP builder FOR TExt. This is part of the CASL project.

    Project mention: Building Modular and Re-purposable NLP Pipelines | 2021-03-02

    Introducing Forte, from the CASL open-source project at Petuum. Forte combines multiple NLP tools to construct an entire NLP pipeline with a few lines of Python and extend it to different domains.

  • FreeDiscovery

    Web Service for E-Discovery Analytics

    Project mention: Non-subscription, non-cloud-based review software? | 2021-08-31

  • PatZilla

    PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.

  • nalcos

    Search Git commits in natural language

    Project mention: NaLCoS: Search commit messages in your repository in natural language | 2021-09-20

  • FinBERT-QA

    Financial Domain Question Answering with pre-trained BERT Language Model

    Project mention: Best way to approach financial statement analysis with NLP and Image Recognition? | 2022-01-23

    Open Domain Question Answering (ODQA) using a deep transformer NLP model that has been fine-tuned on a financial domain dataset such as FiQA.

  • BERT-QE

    Code and resources for the paper "BERT-QE: Contextualized Query Expansion for Document Re-ranking".

    Project mention: [D] BERT-QE: Contextualized Query Expansion for Document Re-ranking (Research Paper Walkthrough) | 2021-02-24

    ⏩ Paper Title: BERT-QE: Contextualized Query Expansion for Document Re-ranking ⏩ Authors: Zhi Zheng, Kai Hui, Ben He, Xianpei Han, Le Sun, Andrew Yates ⏩ Organisation: University of Chinese Academy of Sciences, Amazon Alexa, Institute of Software, Chinese Academy of Sciences, Max Planck Institute for Informatics

  • IP-Tracker

    Track any IP address with IP-Tracker. IP-Tracker is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracker.

    Project mention: Pinpoint IP address detection (in the comments) | 2021-03-30

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-01-23.


What are some of the best open-source information-retrieval projects in Python? This list will help you:

 #  Project         Stars
 1  EasyOCR        13,626
 2  gensim         12,834
 3  haystack        3,750
 4  ranking         2,404
 5  InvoiceNet      1,900
 6  pke             1,075
 7  forte             154
 8  FreeDiscovery      62
 9  PatZilla           58
10  nalcos             48
11  FinBERT-QA         43
12  BERT-QE            32
13  IP-Tracker         31