pke
EasyOCR


 | pke | EasyOCR
---|---|---
Mentions | 3 | 40
Stars | 1,570 | 25,457
Growth | - | 1.8%
Activity | 3.1 | 3.2
Last commit | over 1 year ago | 5 months ago
Language | Python | Python
License | GNU General Public License v3.0 only | Apache License 2.0
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pke
- Question on easing comprehension
- [P] Building model to extract keywords from legal documents
  Look into rake, pke, phrasemachine, pyate, keybert.
- Best approach for automatic key word extraction
  There are lots of off-the-shelf tools for this. Look into:
  - https://github.com/boudinfl/pke
  - https://github.com/kevinlu1248/pyate
  - https://github.com/zelandiya/RAKE-tutorial
  - https://github.com/slanglab/phrasemachine
  - https://github.com/MaartenGr/KeyBERT/
EasyOCR
- Decoding OCR: A Comprehensive Guide
  https://github.com/JaidedAI/EasyOCR
- I built an online PDF management platform using open-source software
  Ok on cleaned aligned data, but there are a few newer ones like EasyOCR [0] that can deal with much less organized text (albeit more slowly).
  [0] https://github.com/JaidedAI/EasyOCR
- Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide
  PyTesseract Module [GitHub], EasyOCR Module [GitHub], PaddlePaddle OCR [GitHub]
- OCR a lot of hand written invoice and records?
- [P] EasyOCR in C++!
  I just uploaded my C++ implementation of EasyOCR, a well-known OCR library for Python. I also dusted some cobwebs off some audio-related projects, so feel free to leave feedback or contribute! I only implemented the most salient parts, so it certainly could use some community help! Cheers!
- OCR at Edge on Cloudflare Constellation
  EasyOCR is a popular project if you are in an environment where you can run Python and PyTorch (https://github.com/JaidedAI/EasyOCR). Other open source projects of note are PaddleOCR (https://github.com/PaddlePaddle/PaddleOCR) and docTR (https://github.com/mindee/doctr).
- Donut: OCR-Free Document Understanding Transformer
  The main one was https://github.com/JaidedAI/EasyOCR, mostly because, as promised, it was pretty easy to use, and it uses PyTorch (which I preferred in case I wanted to tweak it). It has been updated since, but at the time it was using CRNN, which is a solid model, especially for the time: it wasn't (academic) SOTA, but it wasn't far behind. I'm sure I could've coaxed better performance out of it than I got with some retraining and hyperparameter tuning.
- Help with OCR of pixel-y numbers
  Anyway, you can give EasyOCR a shot; it's pretty solid and flexible.
- How to perform document OCR?
- Python unexpectedly quits (macOS Ventura, M1)
  The easyocr library: https://github.com/JaidedAI/EasyOCR
What are some alternatives?
KeyBERT - Minimal keyword extraction with BERT
doctr - docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
textstat - Python package to calculate readability statistics of a text object - paragraphs, sentences, articles.
PaddleOCR - Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
retext-readability - plugin to check readability
OpenCV - Open Source Computer Vision Library
pytextrank - Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
rake-nltk - Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.
tesserocr - A Python wrapper for the tesseract-ocr API
phrasemachine - Quickly extract multi-word phrases from a corpus
LaTeX-OCR - pix2tex: Using a ViT to convert images of equations into LaTeX code.

