Tesseract.js

Mentions: 32 | Stars: 33,214 | Activity: 8.2 | Language: JavaScript

Pure Javascript OCR for more than 100 Languages 📖🎉🖥
cluttr

Mentions: 2 | Stars: - | Activity: -

I made a utility that cleans up your Mac desktop and uses Tesseract to extract text from screenshots. This makes it really easy to find screenshots by searching for a line of text you remember.
https://gitlab.com/bearjaws/cluttr#readme
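
The search step described above can be sketched roughly as follows. This is not cluttr's actual code: the `Screenshot` record, the `normalize` and `search` helpers, and the sample file names are all hypothetical, and the OCR step is assumed to have already run (for example through Tesseract), so each screenshot arrives with its extracted text.

```python
from dataclasses import dataclass


@dataclass
class Screenshot:
    path: str  # hypothetical file location
    text: str  # text assumed to be already extracted by an OCR engine


def normalize(s: str) -> str:
    """Lowercase and collapse whitespace so OCR line breaks do not block a match."""
    return " ".join(s.lower().split())


def search(shots: list[Screenshot], query: str) -> list[str]:
    """Return the paths of screenshots whose extracted text contains the query."""
    needle = normalize(query)
    if not needle:
        return []
    return [s.path for s in shots if needle in normalize(s.text)]


# Made-up sample data standing in for real OCR output.
shots = [
    Screenshot("shot-001.png", "Invoice #1042\nTotal due: 18.50 EUR"),
    Screenshot("shot-002.png", "Meeting notes\nShip the OCR   prototype by Friday"),
]
matches = search(shots, "ship the ocr prototype")
```

Normalizing both sides before the substring check is a deliberate simplification: OCR output tends to have irregular spacing and line breaks, so an exact match on the raw text would miss lines the user remembers correctly.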
binaryen

Mentions: 14 | Stars: 7,087 | Activity: 9.8 | Language: WebAssembly

Optimizer and compiler/toolchain library for WebAssembly

Emscripten can target both WebAssembly and JavaScript. The JavaScript option uses wasm2js - it compiles first to wasm, then compiles that to JS.
https://github.com/WebAssembly/binaryen#wasm2js
The emcc flag -sWASM=0 disables the wasm final output and emits JS instead.
tesseract.js-core

Mentions: 1 | Stars: 147 | Activity: 6.1 | Language: JavaScript

Emscripten port of Tesseract C++ API

It's annoying to find out, after looking through the entire thing, that the actual code that does the OCR is not in this repo. It's just a bunch of scheduling and worker logic, and for some reason the JS is written twice: once for the browser and once for Node.
The actual code that does the OCR is wrapped and included via this package [0], which just wraps the original Tesseract in C++ [1] using wasm. Shameful title.
[0] https://github.com/naptha/tesseract.js-core
[1] https://github.com/jeromewu/tesseract
tesseract

Mentions: 1 | Stars: 4 | Activity: 0.0 | Language: C++

Tesseract Open Source OCR Engine (main repository) (by jeromewu)

It's annoying to find out, after looking through the entire thing, that the actual code that does the OCR is not in this repo. It's just a bunch of scheduling and worker logic, and for some reason the JS is written twice: once for the browser and once for Node.
The actual code that does the OCR is wrapped and included via this package [0], which just wraps the original Tesseract in C++ [1] using wasm. Shameful title.
[0] https://github.com/naptha/tesseract.js-core
[1] https://github.com/jeromewu/tesseract
EasyOCR

Mentions: 38 | Stars: 21,795 | Activity: 4.6 | Language: Python

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

I've had good results with EasyOCR, much better than Tesseract. I agree with you, Tesseract has performed very poorly in my experience.
https://github.com/JaidedAI/EasyOCR

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.