Pytesseract/OCR: RuntimeError: can't start new thread when no multi-threading

This page summarizes the projects mentioned and recommended in the original post on /r/learnpython

  • tesserocr

    A Python wrapper for the tesseract-ocr API

  • If you want a suggestion, use tesserocr instead of pytesseract. It's an actual binding to the Tesseract library (Python talks to it directly instead of launching the tesseract program as a subprocess), which means it runs more efficiently, you can process multiple images in sequence with the same OCR engine (pytesseract has to start a new process and a new engine for every image), and you get access to far more of Tesseract's options and functionality. If you're preprocessing with OpenCV, you can even pass those arrays to Tesseract in memory, whereas pytesseract writes each image to a temporary file before the tesseract program can read it.
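The engine-reuse point above can be sketched roughly as follows. This is a minimal illustration, assuming tesserocr, Pillow, and the Tesseract language data are installed; the helper names `ocr_images` and `ocr_files` are hypothetical and not part of tesserocr, while `PyTessBaseAPI`, `SetImage`, and `GetUTF8Text` come from the tesserocr binding.

```python
try:
    # Real binding; needs the Tesseract library and tesserocr installed.
    from tesserocr import PyTessBaseAPI
except ImportError:
    PyTessBaseAPI = None  # lets the helper below be imported without tesserocr


def ocr_images(api, images):
    """Hypothetical helper: OCR every image with one already-initialised engine."""
    texts = []
    for image in images:
        api.SetImage(image)              # hand the PIL image to the engine in memory
        texts.append(api.GetUTF8Text())  # recognise the image that was just set
    return texts


def ocr_files(paths):
    """Hypothetical helper: open image files and OCR them with a single engine."""
    from PIL import Image  # imported lazily so ocr_images works without Pillow

    if PyTessBaseAPI is None:
        raise RuntimeError("tesserocr is not installed")
    # One engine for the whole batch: no subprocess and no per-image start-up.
    with PyTessBaseAPI() as api:
        return ocr_images(api, (Image.open(p) for p in paths))
```

For an OpenCV array, one option is to convert it with `PIL.Image.fromarray` (after converting BGR to RGB) and pass the result to `SetImage`, which keeps the pixels in memory the whole time.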

