Donut: OCR-Free Document Understanding Transformer

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • donut

    Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

  • OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

  • PaddleOCR

    Multilingual OCR toolkit based on PaddlePaddle: a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded and IoT devices.

    When I was evaluating options a few months ago I found https://github.com/PaddlePaddle/PaddleOCR to be a very strong contender for my use case (reading product labels), but you'll definitely want to put together some representative docs/images and test a bunch of solutions to see what works for you.

  • EasyOCR

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.

    The main one was https://github.com/JaidedAI/EasyOCR, mostly because, as promised, it was pretty easy to use, and uses pytorch (which I preferred in case I wanted to tweak it). It has been updated since, but at the time it was using CRNN, which is a solid model, especially for the time - it wasn't (academic) SOTA but not far behind that. I'm sure I could've coaxed better performance than I got out of it with some retraining and hyperparameter tuning.
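
The advice quoted above, to assemble representative images and test several OCR solutions against them, can be sketched as a small comparison harness. This is only a minimal illustration, not anything from the listed projects: the names `edit_distance`, `char_error_rate`, and `rank_engines` are hypothetical, and each engine is assumed to be wrapped as a plain function that takes an image path and returns the recognized text. The metric used here is character error rate (edit distance divided by reference length), which is one common choice among several.

```python
from typing import Callable, Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(predicted: str, reference: str) -> float:
    """Edit distance normalized by reference length; lower is better."""
    if not reference:
        return 0.0 if not predicted else 1.0
    return edit_distance(predicted, reference) / len(reference)


def rank_engines(
    engines: Dict[str, Callable[[str], str]],
    samples: List[Tuple[str, str]],
) -> List[Tuple[str, float]]:
    """Return (engine name, mean character error rate) pairs, best engine first.

    `engines` maps a label to a hypothetical wrapper: image path -> recognized text.
    `samples` is a list of (image path, hand-checked ground-truth text) pairs.
    """
    if not samples:
        raise ValueError("at least one labeled sample is required")
    scores = []
    for name, recognize in engines.items():
        rates = [char_error_rate(recognize(path), truth) for path, truth in samples]
        scores.append((name, sum(rates) / len(rates)))
    return sorted(scores, key=lambda item: item[1])
```

In practice each entry in `engines` would be a thin wrapper around one candidate (for example PaddleOCR or EasyOCR), and `samples` would be a few dozen images from the actual use case with manually transcribed text. A small, representative set is usually more informative than published benchmark numbers for deciding which tool fits a specific kind of document.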

Related posts

  • Tesseract OCR

    10 projects | news.ycombinator.com | 18 Jul 2021
  • Supervision – reusable computer vision tools

    1 project | news.ycombinator.com | 20 Mar 2024
  • Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide

    5 projects | dev.to | 27 Dec 2023
  • OCR a lot of hand written invoice and records?

    1 project | /r/selfhosted | 7 Dec 2023
  • [P] EasyOCR in C++!

    2 projects | /r/MachineLearning | 2 Dec 2023