OCR software that works?

This page summarizes the projects mentioned and recommended in the original post on /r/datacurator.

  • naps2

    Scan documents to PDF and more, as simply as possible.

  • The suggested NAPS2 (https://www.naps2.com) uses Tesseract as its OCR backend, so OP might need to look at Tesseract itself to add the required language data.

  • paperless-ngx

    A community-supported supercharged version of paperless: scan, index and archive all your physical documents

  • I am using Paperless NGX (https://github.com/paperless-ngx/paperless-ngx). It is much more than just OCR software, but it works without problems and can also do batch ingestion. Maybe it fits your needs.

  • tesseract

    Tesseract Open Source OCR Engine (main repository) (by UB-Mannheim)
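
Regarding the language data mentioned in the NAPS2 comment: Tesseract reads its languages from `.traineddata` files in a `tessdata` directory, and the command `tesseract --list-langs` prints the ones it can find. The following is only a minimal sketch for checking whether a language is missing, assuming a `tesseract` binary is on the PATH. The function names are hypothetical, and NAPS2 may manage its own copy of the language files, so the result may not reflect what NAPS2 itself sees.

```python
import shutil
import subprocess


def parse_list_langs(output: str) -> list[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        langs.append(line)
    return langs


def installed_languages() -> list[str]:
    """Ask the local tesseract binary which language packs it can see."""
    if shutil.which("tesseract") is None:
        return []  # tesseract is not on PATH (assumption: a system-wide install)
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # The list may appear on stdout or stderr depending on the version,
    # so both streams are parsed. Any warning lines would be picked up too.
    return parse_list_langs(result.stdout + "\n" + result.stderr)


def missing_languages(wanted: list[str], installed: list[str]) -> list[str]:
    """Return the wanted language codes that are not installed."""
    return [code for code in wanted if code not in installed]


if __name__ == "__main__":
    # "eng" and "deu" are example codes; replace them with the ones you need.
    print(missing_languages(["eng", "deu"], installed_languages()))
```

Any code reported as missing would need its `.traineddata` file added to the `tessdata` directory before OCR in that language can work.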

