Frog: OCR Tool for Linux

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • normcap

    OCR powered screen-capture tool to capture information instead of images

  • tessdata

    Trained models with fast variant of the "best" LSTM models + legacy models

    Frog appears to be a nice wrapper around Tesseract:

    https://github.com/tesseract-ocr/tessdata

    https://en.wikipedia.org/wiki/Tesseract_(software)

    The demo, of course, works perfectly on a Mac, since this feature is already built into macOS Ventura.

      In November 2020, Brewster Kahle from the Internet Archive praised Tesseract saying:

  • doctr

    docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

    There's also docTR, which can do text detection and extraction out of the box.

    It's command-line driven but can display the detected text as an overlay on the document.

    https://github.com/mindee/doctr

  • OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.

  • PaddleOCR

    Multilingual OCR toolkit based on PaddlePaddle: a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded, and IoT devices.

    I’ve had good results from PaddleOCR.

    https://github.com/PaddlePaddle/PaddleOCR

  • flameshot

    Powerful yet simple to use screenshot software

    Cool! I've seen similar ideas before and, inspired by them, made my own some years ago. It's a simple bash script that uses [flameshot](https://flameshot.org/) to take the screenshot and Tesseract to run the OCR:

        #!/usr/bin/env bash
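
The script quoted above is cut off after its shebang line, so its real contents are unknown. As a rough illustration of the idea the comment describes (select a screen region with flameshot, then run Tesseract on it), a minimal sketch might look like the following. The function name `ocr_region` and the default language `eng` are assumptions made for this example and do not come from the original script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: OCR a screen region chosen with flameshot.
# Not the commenter's script, which is truncated in the source.

ocr_region() {
    local lang tool
    lang="${1:-eng}"   # Tesseract language code; "eng" is an assumed default

    # Fail early if a required tool is not available.
    for tool in flameshot tesseract; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing dependency: $tool" >&2
            return 1
        fi
    done

    # "flameshot gui --raw" writes the selected region as PNG to stdout.
    # "tesseract stdin stdout" reads that image from stdin and prints the
    # recognized text to stdout.
    flameshot gui --raw | tesseract stdin stdout -l "$lang"
}
```

Calling `ocr_region` opens the flameshot selection UI and prints the recognized text; `ocr_region deu` would do the same with the German model, provided that language data is installed. Piping the output into a clipboard tool such as `xclip -selection clipboard` is one option on X11, though which clipboard tool is available depends on the desktop setup.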

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
