This tutorial is a good start toward extracting data from an image of a form with a known structure. I'd personally recommend tesserocr over pytesseract: tesserocr binds directly to the Tesseract library, so it is more efficient and exposes more functionality, whereas pytesseract writes each image to disk before processing and runs the command-line tool in a subprocess instead of binding to the library. Both should work, though; the tutorial uses pytesseract, which is also what u/Iceberg_Bart_Simpson linked to.
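As a rough illustration of the tesserocr route, here is a minimal sketch that reads fixed regions of a form image. The field names and pixel coordinates in `FIELDS` are made up for the example, and `read_fields` and `read_form_with_tesserocr` are hypothetical helpers, not part of either library. It assumes tesserocr, Pillow, and the Tesseract language data are installed.

```python
from typing import Callable, Dict, Tuple

# (left, top, width, height) in pixels
Box = Tuple[int, int, int, int]

# Hypothetical layout: where each field sits on the known form.
FIELDS: Dict[str, Box] = {
    "name": (120, 80, 400, 40),
    "date": (120, 140, 200, 40),
}


def read_fields(recognize: Callable[[Box], str],
                fields: Dict[str, Box]) -> Dict[str, str]:
    """Call `recognize` on each field rectangle and strip the result."""
    return {name: recognize(box).strip() for name, box in fields.items()}


def read_form_with_tesserocr(path: str,
                             fields: Dict[str, Box] = FIELDS) -> Dict[str, str]:
    """OCR each field of the form image at `path` via the library binding."""
    # Imported lazily so read_fields stays usable without the OCR stack.
    from PIL import Image
    from tesserocr import PyTessBaseAPI

    with PyTessBaseAPI() as api:
        # The image is handed over in memory; no temp file, no subprocess.
        api.SetImage(Image.open(path))

        def recognize(box: Box) -> str:
            # Restrict recognition to one rectangle of the loaded image.
            api.SetRectangle(*box)
            return api.GetUTF8Text()

        return read_fields(recognize, fields)
```

With pytesseract, the equivalent `recognize` would crop the image with Pillow and pass the crop to `pytesseract.image_to_string`, which launches a new Tesseract process for every field.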