EasyOCR
unilm
EasyOCR | unilm | |
---|---|---|
38 | 40 | |
22,049 | 18,358 | |
2.0% | 1.7% | |
3.6 | 9.0 | |
about 1 month ago | 11 days ago | |
Python | Python | |
Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
EasyOCR
-
Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide
PyTesseract Module [ Github ] EasyOCR Module [ Github ] PaddlePaddle OCR [ Github ]
- OCR a lot of hand written invoice and records?
-
[P] EasyOCR in C++!
I just uploaded my C++ implementation of EasyOCR, a well known ocr library for python. Also dusted some cobwebbs from some audio related projects as well, feel free to leave feedback or contribute! I only implemented the most salient parts, so certainly could use some community help! Cheers!
-
OCR at Edge on Cloudflare Constellation
EasyOCR is a popular project if you are in an environment where you can use run Python and PyTorch (https://github.com/JaidedAI/EasyOCR). Other open source projects of note are PaddleOCR (https://github.com/PaddlePaddle/PaddleOCR) and docTR (https://github.com/mindee/doctr).
-
Donut: OCR-Free Document Understanding Transformer
The main one was https://github.com/JaidedAI/EasyOCR, mostly because, as promised, it was pretty easy to use, and uses pytorch (which I preferred in case I wanted to tweak it). It has been updated since, but at the time it was using CRNN, which is a solid model, especially for the time - it wasn't (academic) SOTA but not far behind that. I'm sure I could've coaxed better performance than I got out of it with some retraining and hyperparameter tuning.
-
Help with OCR of pixel-y numbers
Anyways, you can give a shot to EasyOCR, pretty solid and flexible
- How to perform document OCR?
-
Python unexpectedly quits (macOS ventura, M1)
The easyocr library: https://github.com/JaidedAI/EasyOCR
- I made a website for a friend who owns a restaurant. He's wondering if there's a way to upload a picture of his menu daily. What is the best way to do this?
-
Raspberry Pi Easyocr
Not used it on a Pi but maybe a Docker version (if there is one) would run? Compose file here
unilm
- The Era of 1-Bit LLMs: Training_Tips, Code And_FAQ [pdf]
- The Era of 1-Bit LLMs: Training Tips, Code and FAQ
-
The Era of 1-bit LLMs: ternary parameters for cost-effective computing
+1 On this, the real proof would have been testing both models side-by-side.
It seems that it may be published on GitHub [1] according to HuggingFace [2].
[1] https://github.com/microsoft/unilm/tree/master/bitnet
[2] https://huggingface.co/papers/2402.17764
- I'm an Old Fart and AI Makes Me Sad
-
On building a semantic search engine
e5-mistral is essentially a distillation from gpt-4 to a smaller model. You can see here https://github.com/microsoft/unilm/blob/16da2f193b9c1dab0a69...
they actually have custom prompts for each dataset being tested.
Question would be, if you haven't seen the task before, what is a good prompt to prepend for your task?
IMO e5-mistral is overfit to MTEB
-
Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide
Layout LM v1, v2 and v3 models [ Github ] DocBERT [ Github ]
-
Microsoft Publishes LongNet: Scaling Transformers to 1,000,000,000 Tokens
The repository is available here.
-
Recommended open LLMs with image input modality?
It is missing kosmos-2. I remember its image captioning was(demo currently down) really good and it's almost as fast as llava and lavin.
-
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Should be this: https://github.com/microsoft/unilm/
-
[R] LongNet: Scaling Transformers to 1,000,000,000 Tokens
This is from Microsoft Research (Asia). https://aka.ms/GeneralAI
What are some alternatives?
PaddleOCR - Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
ERNIE - Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.
doctr - docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
involution - [CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator
OpenCV - Open Source Computer Vision Library
gensim - Topic Modelling for Humans
awesome-colab-notebooks - Collection of google colaboratory notebooks for fast and easy experiments
maelstrom - A workbench for writing toy implementations of distributed systems.
tesserocr - A Python wrapper for the tesseract-ocr API
rasa - 💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants