OCR with Python

This page summarizes the projects mentioned and recommended in the original post on /r/learnpython.

  • OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

  • ocrmypdf is what I’d normally suggest if you just want to apply OCR to an entire PDF of scanned pages.

  • tesserocr

    A Python wrapper for the tesseract-ocr API

  • If you have an electronically created PDF (not scanned) and you just want to run OCR on the embedded images, then you’ll want a PDF library that can extract the figure images for you. You can then use tesserocr to run OCR on those images.

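
The two suggestions above could look roughly like the sketch below. It is only an illustration, not the original poster's code. The post does not name a PDF library, so pypdf is an assumed choice here, Pillow is assumed for decoding the image bytes, and the language "eng" and all function names are hypothetical. Third-party imports are kept inside the functions so that each path only needs its own dependencies installed.

```python
import io
from typing import Callable, Dict, Iterable, Iterator, Tuple

LabelledImage = Tuple[str, bytes]  # (label, raw image bytes)


def ocr_scanned_pdf(input_pdf: str, output_pdf: str) -> None:
    """Scanned PDF: write a copy with a searchable OCR text layer."""
    import ocrmypdf  # third-party; requires Tesseract to be installed

    # "eng" is an assumed language; pass whichever languages you have installed.
    ocrmypdf.ocr(input_pdf, output_pdf, language=["eng"])


def extract_embedded_images(pdf_path: str) -> Iterator[LabelledImage]:
    """Electronically created PDF: yield each embedded image as bytes."""
    from pypdf import PdfReader  # assumed choice of PDF library

    reader = PdfReader(pdf_path)
    for page_number, page in enumerate(reader.pages, start=1):
        for image in page.images:
            yield f"page{page_number}-{image.name}", image.data


def recognize_image_bytes(data: bytes) -> str:
    """Run OCR on one image with tesserocr."""
    from PIL import Image  # Pillow, used to decode the image bytes
    import tesserocr

    with Image.open(io.BytesIO(data)) as img:
        # Convert to RGB so palette or CMYK images are handled uniformly.
        return tesserocr.image_to_text(img.convert("RGB"))


def ocr_embedded_images(
    pdf_path: str,
    extract: Callable[[str], Iterable[LabelledImage]] = extract_embedded_images,
    recognize: Callable[[bytes], str] = recognize_image_bytes,
) -> Dict[str, str]:
    """Map each embedded image label to the text recognized in it."""
    return {label: recognize(data).strip() for label, data in extract(pdf_path)}
```

For a scanned document you would call something like `ocr_scanned_pdf("scan.pdf", "scan_ocr.pdf")`. For a born-digital PDF, `ocr_embedded_images("report.pdf")` returns a dictionary of image labels to recognized text. The file names are placeholders. The `extract` and `recognize` parameters exist only so a different PDF library or OCR call can be swapped in.
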
