PaddleOCR
BoofCV
Metric | PaddleOCR | BoofCV
---|---|---
Mentions | 60 | 20
Stars | 38,373 | 1,035
Growth | 4.5% | -
Activity | 8.6 | 8.5
Last commit | 3 days ago | about 2 months ago
Language | Python | Java
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PaddleOCR
- Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide
PyTesseract Module [GitHub], EasyOCR Module [GitHub], PaddlePaddle OCR [GitHub]
- What is the best repo for hand written text recognition?
My default recommendation for OCR is https://github.com/PaddlePaddle/PaddleOCR but most of the examples there are not handwritten - so I'm not sure how well it'll handle it this time.
- Ask HN: Best way to perform complex OCR task in 2023?
Other than EasyOCR and Tesseract, PaddleOCR (https://github.com/PaddlePaddle/PaddleOCR) is probably the most well known open-source OCR solution.
What are you planning to do with the text after detecting / recognizing it? How fast does the detection / recognition need to be in order to be useful?
- Show HN: BetterOCR combines and corrects multiple OCR engines with an LLM
Yup! But I'm still exploring options. (any recommendations would be welcomed!) Here are some candidates I'm considering:
- https://github.com/mindee/doctr
- https://github.com/open-mmlab/mmocr
- https://github.com/PaddlePaddle/PaddleOCR (honestly I don't know Mandarin so I'm a bit stuck)
- https://github.com/clovaai/donut - While it's primarily an "OCR-free document understanding transformer," I think it's worth experimenting with. Think I can sort this out by letting the LLM reason through it multiple times (although this will impact performance)
- yesterday got a suggestion to consider https://github.com/kakaobrain/pororo - I don't think development is still active but the results are pretty great on Korean text
- How would you go about driving contextual data from images?
For images with text, if you want to do visual QA, document classification, or table/key information extraction, check out https://huggingface.co/blog/document-ai https://github.com/philschmid/document-ai-transformers https://github.com/google-research/pix2struct https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/README.md
- OCR at Edge on Cloudflare Constellation
EasyOCR is a popular project if you are in an environment where you can run Python and PyTorch (https://github.com/JaidedAI/EasyOCR). Other open source projects of note are PaddleOCR (https://github.com/PaddlePaddle/PaddleOCR) and docTR (https://github.com/mindee/doctr).
- Seeking Advice for Improving OCR Accuracy in a Code Snippet Reader Project
I think you can train Tesseract with custom data if you have enough of it, or you can use deep learning models like https://pyimagesearch.com/2020/08/17/ocr-with-keras-tensorflow-and-deep-learning or https://www.google.com/amp/s/nanonets.com/blog/attention-ocr-for-text-recogntion/amp/ or try other existing tools like PaddleOCR https://github.com/PaddlePaddle/PaddleOCR
- How do you parse tables in PDF with langchain? Especially, the context which is few lines above and below the table.
https://huggingface.co/blog/document-ai https://github.com/microsoft/table-transformer https://github.com/google-research/pix2struct https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/table/README.md
- unable to install paddleocr on m1 mac
When following the installation commands in the PaddleOCR repo (https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/quickstart_en.md), I'm still unable to install paddleocr. paddlepaddle installed successfully on my M1 Mac with Python 3.9.16, but while installing paddleocr I'm getting this error after long pip backtracking
- Donut: OCR-Free Document Understanding Transformer
When I was evaluating options a few months ago I found https://github.com/PaddlePaddle/PaddleOCR to be a very strong contender for my use case (reading product labels), but you'll definitely want to put together some representative docs/images and test a bunch of solutions to see what works for you.
BoofCV
- Recommended camera/projector calibration software?
BoofCV https://github.com/lessthanoptimal/BoofCV
- JDK 21 - Image Performance Improvements
Is there any fast way to get pixel values and pixel coordinates? I had to jump through a lot of hoops to convert BufferedImages into a format that's useful for image processing at a reasonable speed in BoofCV. getRGB() is glacial. At one point I was trying to convince the JDK team to make private data structures public again. Right now it's inconsistent what you have access to.
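The complaint about getRGB() can be illustrated with a minimal sketch. It assumes an image created as TYPE_INT_RGB or TYPE_INT_ARGB with a full (non-sub-image) raster, so the backing int[] can be indexed as y * width + x. The class and method names (PixelAccessSketch, sumRedViaGetRGB, sumRedViaDataBuffer) are hypothetical, and this is not how BoofCV itself converts images.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

// Hypothetical illustration class, not part of BoofCV or the JDK.
class PixelAccessSketch {

    /** Slow path: one getRGB() call per pixel. */
    static long sumRedViaGetRGB(BufferedImage img) {
        long sum = 0;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                sum += (img.getRGB(x, y) >> 16) & 0xFF; // red channel
            }
        }
        return sum;
    }

    /**
     * Faster path: read the backing int[] directly.
     * Assumption: int-packed image whose scanline stride equals its width
     * (true for images built with new BufferedImage(w, h, TYPE_INT_RGB)).
     */
    static long sumRedViaDataBuffer(BufferedImage img) {
        int type = img.getType();
        if (type != BufferedImage.TYPE_INT_RGB && type != BufferedImage.TYPE_INT_ARGB) {
            throw new IllegalArgumentException("expected an int-packed image");
        }
        int[] data = ((DataBufferInt) img.getRaster().getDataBuffer()).getData();
        int w = img.getWidth();
        int h = img.getHeight();
        long sum = 0;
        for (int y = 0; y < h; y++) {
            int row = y * w; // start of this scanline in the flat array
            for (int x = 0; x < w; x++) {
                sum += (data[row + x] >> 16) & 0xFF; // red channel
            }
        }
        return sum;
    }
}
```

Other image types (for example TYPE_3BYTE_BGR) keep their pixels in a byte[] and need a different cast, and grabbing the backing array may stop Java 2D from hardware-accelerating that image, so this is a trade-off rather than a universal fix.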
- Mobile device based surface defect detection for manufacturing quality control
Apologies if this post runs afoul of the rules. This is a product we are making that started as an open source project.
- BoofCV 0.40 Released. Micro QR Code, Transposed QR Codes, Speed improvements, Strict null enforcement
- The Ancient Secrets of Computer Vision
I will take the opportunity to call out one of my favourite libraries, BoofCV (http://boofcv.org).
It comes with a wonderful demonstration tool that allows you to apply the various included algorithms to images and tweak the parameters in real-time – including the Hough transform. A great tool for helping to understand how these kinds of algorithms work!
- Good Open Source Repositories that Accepts New Contributors
Speaking of using the Vector API, I see there's a class in BoofCV that converts RGB to HSV. I've previously written a SIMD-accelerated version of RGB to HSV using the Java Vector API. For anybody looking to do a bit of code janitor work, converting the one-off library into something that could be contributed to /u/lessthanoptimal's project might be a worthwhile contribution.
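For readers unfamiliar with the conversion being discussed, here is a minimal scalar sketch of RGB to HSV. It is a plain baseline, not the SIMD version the poster describes and not BoofCV's own class; the name RgbToHsvSketch and the choice of hue in degrees with saturation and value in [0, 1] are assumptions made for illustration.

```java
// Hypothetical illustration class; conventions (degrees, [0,1] ranges) are assumptions.
class RgbToHsvSketch {

    /** Converts 8-bit r, g, b to {hue in degrees [0, 360), saturation [0, 1], value [0, 1]}. */
    static float[] rgbToHsv(int r, int g, int b) {
        float rf = r / 255f;
        float gf = g / 255f;
        float bf = b / 255f;

        float max = Math.max(rf, Math.max(gf, bf));
        float min = Math.min(rf, Math.min(gf, bf));
        float delta = max - min;

        float h;
        if (delta == 0f) {
            h = 0f; // gray: hue is undefined, report 0 by convention
        } else if (max == rf) {
            h = 60f * (((gf - bf) / delta) % 6f);
        } else if (max == gf) {
            h = 60f * ((bf - rf) / delta + 2f);
        } else {
            h = 60f * ((rf - gf) / delta + 4f);
        }
        if (h < 0f) {
            h += 360f; // wrap negative hues into [0, 360)
        }

        float s = (max == 0f) ? 0f : delta / max; // black has zero saturation
        return new float[] {h, s, max};
    }
}
```

A vectorized variant would apply the same arithmetic to several pixels per instruction, with the branches replaced by masked selects; the scalar form is mainly useful as a reference to test such a version against.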
- JavaFX .jar (from clojure) won't find "glass" when run via `java -jar`
For instance I looked at BoofCV and all I found was: https://github.com/lessthanoptimal/BoofCV/issues/265 "the question now seems to be can you compile the library as native. The answer is probably but someone needs to try it."
- BoofCV v0.38 Release Summary
It's an all-Java computer vision library, see https://boofcv.org/ and https://github.com/lessthanoptimal/BoofCV
- BoofCV v0.38: Much improved scene reconstruction, loop closure, more concurrency. Also updated PyBoof
project website: https://boofcv.org
- Feedback Requested on Updated BoofCV QR Code Tutorial
There really is a community for everything on Reddit... Anyways, so I've updated the tutorial on QR Codes in BoofCV. If you're not familiar with it, BoofCV is a computer vision library that also includes a high quality QR Code scanner. Plenty of benchmarks and examples can be found on the website to back that up. Here's a link to the tutorial page and please let me know if it all makes sense. The tutorial focuses on applications which are either command line or have a GUI.
What are some alternatives?
EasyOCR - Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
OpenCV - Open Source Computer Vision Library
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
mmocr - OpenMMLab Text Detection, Recognition and Understanding Toolbox
fSpy - A cross platform app for quick and easy still image camera matching
Tesseract.js - Pure Javascript OCR for more than 100 Languages 📖🎉🖥
OCRmyPDF - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
keras-ocr - A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.
Codename One - Cross-platform framework for building truly native mobile apps with Java or Kotlin. Write Once Run Anywhere support for iOS, Android, Desktop & Web.