tesseract-ocr

Tesseract Open Source OCR Engine (main repository) (by tesseract-ocr)

Tesseract-ocr Alternatives

Similar projects and alternatives to tesseract-ocr

  1. calibre

    The official source code repository for the calibre ebook manager

  2. ShareX

    ShareX is a free and open source program that lets you capture or record any area of your screen and share it with a single press of a key. It also allows uploading images, text or other types of files to many supported destinations you can choose from.

  3. logseq

    562 tesseract-ocr VS logseq

    A local-first, non-linear, outliner notebook for organizing and sharing your personal knowledge base. Use it to organize your todo list, to write your journals, or to record your unique life.

  4. pandoc

    445 tesseract-ocr VS pandoc

    Universal markup converter

  5. xournalpp

    Xournal++ is a handwriting notetaking software with PDF annotation support. Written in C++ with GTK3, supporting Linux (e.g. Ubuntu, Debian, Arch, SUSE), macOS and Windows 10. Supports pen input from devices such as Wacom Tablets.

  6. OpenCV

    Open Source Computer Vision Library

  7. typst

    A new markup-based typesetting system that is powerful and easy to learn.

  8. OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

  9. PaddleOCR

    Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

  10. EasyOCR

    42 tesseract-ocr VS EasyOCR

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

  11. rnote

    Sketch and take handwritten notes.

  12. Tesseract.js

    34 tesseract-ocr VS Tesseract.js

    Pure Javascript OCR for more than 100 Languages 📖🎉🖥

  13. marker

    30 tesseract-ocr VS marker

    Convert PDF to markdown + JSON quickly with high accuracy

  14. docling

    25 tesseract-ocr VS docling

    Get your documents ready for gen AI

  15. normcap

    OCR powered screen-capture tool to capture information instead of images

  16. pytesseract

    A Python wrapper for Google Tesseract

  17. gImageReader

    A Gtk/Qt front-end to tesseract-ocr.

  18. tessdata

    Trained models with fast variant of the "best" LSTM models + legacy models

  19. hsk30

    HSK 3.0 Vocabulary Lists (words and characters)

  20. SVG++

    C++ SVG library

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a better tesseract-ocr alternative or higher similarity.

tesseract-ocr reviews and mentions

Posts with mentions or reviews of tesseract-ocr. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-03-06.
  • Mistral OCR
    7 projects | news.ycombinator.com | 6 Mar 2025
    https://www.home-assistant.io/integrations/seven_segments/

    https://www.unix-ag.uni-kl.de/~auerswal/ssocr/

    https://github.com/tesseract-ocr/tesseract

    https://community.home-assistant.io/t/ocr-on-camera-image-fo...

    https://www.google.com/search?q=home+assistant+ocr+integrati...

    https://www.google.com/search?q=esphome+ocr+sensor

    https://hackaday.com/2021/02/07/an-esp-will-read-your-meter-...

    ...start digging around and you'll likely find something. HA has integrations which can support writing to InfluxDB (local for sure, and you can probably configure it for a remote influxdb).

    You're looking at 1xRaspberry PI, 1xUSB Webcam, 1x"Power Management / humidity management / waterproof electrical box" to stuff it into, and then either YOLO and DIY to shoot over to your influxdb, or set up a Home Assistant and "attach" your frankenbox as some sort of "sensor" or "integration" which spits out metrics and yadayada...

  • Ask HN: What is the best method for turning a scanned book as a PDF into text?
    13 projects | news.ycombinator.com | 16 Feb 2025
    Two possibilities are "top of mind" for me:

    You could script it using Gemini via the API[1].

    Or use Tesseract[2].

    [1]: https://ai.google.dev/

    [2]: https://github.com/tesseract-ocr/tesseract

  • OCR4all
    15 projects | news.ycombinator.com | 13 Feb 2025
  • OCR Solutions Uncovered: How to Choose the Best for Different Use Cases
    2 projects | dev.to | 1 Aug 2024
    Custom Integration: Developers and businesses needing flexibility for custom integration into applications and projects should consider open-source solutions like Tesseract OCR or API-based services like API4AI OCR. These options provide APIs for seamless integration into existing software systems.
  • Mastering Text Extraction from Multi-Page PDFs Using OCR API: A Step-by-Step Guide
    1 project | dev.to | 15 Jul 2024
    Tesseract OCR is an open-source OCR engine created by Google, known for its accuracy and wide language support. It is particularly favored by developers for its flexibility and the absence of licensing fees, allowing it to be integrated into various applications. However, it demands more effort to set up and utilize compared to cloud-based OCR services.
  • OCR with tesseract, python and pytesseract
    2 projects | dev.to | 4 Jun 2024
    If you want to learn more, visit the complete Tesseract documentation.
  • OCR Tools for Mac, iOS and Windows
    1 project | news.ycombinator.com | 3 Jun 2024
    You can use tesseract

    https://tesseract-ocr.github.io/

  • Multimodal AI: Bridging the Gap Between Human and Machine Understanding
    1 project | dev.to | 14 May 2024
    AI copilots: Copilots powered by various LLMs like Pieces Copilot can leverage computer vision technologies for inputs beyond text and code. For example, optical character recognition software at Pieces uses Tesseract as its main OCR code engine, extended with bicubic upsampling. Pieces then uses edge-ML models to auto-correct any potential defects in the resulting code/text, which users can input as prompts to the AI copilot. Pieces Copilot in its current iteration also comes with a unique tool called the Workstream Pattern Engine which gathers real-time context from any application through computer vision, enabling Pieces to understand everything on your screen and pass it through to the LLM so you can talk to the AI about it.
  • I built an online PDF management platform using open-source software
    4 projects | news.ycombinator.com | 12 May 2024
    I used open-source solutions to build it, like LibreOffice, Ghostscript, Google's Tesseract, and a bunch of other tools. Google's Tesseract: https://github.com/tesseract-ocr/tesseract
  • Highlighting Image Text
    1 project | dev.to | 30 Apr 2024
    We are going to be using an OCR (Optical Character Recognition) engine called Tesseract for the image-to-text recognition part. It is free software, released under the Apache License. Install the engine for your desired OS from their official website. I'm using Windows for this. Add the installation path to your environment variables.
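
The setup described in the last excerpt (install the engine, then put its folder on the PATH) can be checked from Python before any OCR is attempted. What follows is only a minimal sketch, assuming the `tesseract` executable is reachable on the PATH; the helper names `build_tesseract_command` and `ocr_image` are hypothetical and do not come from any of the posts above. It calls the command-line tool directly, using `stdout` as the output base so the recognized text is printed rather than written to a file.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", psm=None):
    """Build the argument list for a Tesseract CLI call that prints to stdout.

    Hypothetical helper: "stdout" is passed as the output base so the
    recognized text goes to standard output instead of a .txt file.
    """
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if psm is not None:
        # Optional page segmentation mode, passed through as given.
        cmd += ["--psm", str(psm)]
    return cmd


def ocr_image(image_path, lang="eng", psm=None):
    """Run Tesseract on one image and return the recognized text."""
    # Fail early with a clear message if the engine is not on the PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract exits non-zero
    )
    return result.stdout
```

Calling `ocr_image("scan.png")` would then return the text for a hypothetical `scan.png`, provided the engine and the English language data are installed. Wrappers such as pytesseract, listed above, cover the same ground with a higher-level interface.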

Stats

Basic tesseract-ocr repo stats
Mentions: 130
Stars: 66,835
Activity: 9.0
Last commit: 19 days ago
