PaddleOCR
DeepFaceLive (DISCONTINUED)
Metric | PaddleOCR | DeepFaceLive |
---|---|---|
Mentions | 60 | 55 |
Stars | 37,652 | 13,912 |
Growth | 4.0% | - |
Activity | 8.6 | 8.4 |
Last commit | 3 days ago | 10 months ago |
Language | Python | Python |
License | Apache License 2.0 | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PaddleOCR
- Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide
PyTesseract Module [GitHub], EasyOCR Module [GitHub], PaddlePaddle OCR [GitHub]
- Show HN: BetterOCR combines and corrects multiple OCR engines with an LLM
Yup! But I'm still exploring options. (any recommendations would be welcomed!) Here are some candidates I'm considering:
- https://github.com/mindee/doctr
- https://github.com/open-mmlab/mmocr
- https://github.com/PaddlePaddle/PaddleOCR (honestly I don't know Mandarin so I'm a bit stuck)
- https://github.com/clovaai/donut - While it's primarily an "OCR-free document understanding transformer," I think it's worth experimenting with. Think I can sort this out by letting the LLM reason through it multiple times (although this will impact performance)
- yesterday got a suggestion to consider https://github.com/kakaobrain/pororo - I don't think development is still active but the results are pretty great on Korean text
- How would you go about driving contextual data from images?
For images with text, if you want to do visual QA, document classification, or table/key information extraction, check out https://huggingface.co/blog/document-ai https://github.com/philschmid/document-ai-transformers https://github.com/google-research/pix2struct https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/README.md
- OCR at Edge on Cloudflare Constellation
EasyOCR is a popular project if you are in an environment where you can run Python and PyTorch (https://github.com/JaidedAI/EasyOCR). Other open source projects of note are PaddleOCR (https://github.com/PaddlePaddle/PaddleOCR) and docTR (https://github.com/mindee/doctr).
- How do you parse tables in PDF with langchain? Especially, the context which is few lines above and below the table.
https://huggingface.co/blog/document-ai https://github.com/microsoft/table-transformer https://github.com/google-research/pix2struct https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/table/README.md
- Donut: OCR-Free Document Understanding Transformer
When I was evaluating options a few months ago I found https://github.com/PaddlePaddle/PaddleOCR to be a very strong contender for my use case (reading product labels), but you'll definitely want to put together some representative docs/images and test a bunch of solutions to see what works for you.
- [Python] [OCR] A new OCR tool with better text recognition for documents and cards.
- [D] Can I use ML/AI to read the back panels of electronic components?
PaddlePaddle/PaddleOCR: Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
- Frog: OCR Tool for Linux
I’ve had good results from PaddleOCR.
- [OCR] The 24k star repo about OCR with 30+ languages supported including Chinese, Japanese .. and image conversion to excel file supported.
And you can find a lot of corpus and dictionaries in the pinned issue Multilingual OCR Development Plan from the community.
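One of the comments above suggests assembling representative images and testing several OCR engines on them. A minimal sketch of such a comparison follows. It assumes each engine is wrapped in a hypothetical callable that takes an image path and returns the recognized text, and it scores results by character error rate (Levenshtein distance divided by reference length). The stub engines and the file name are placeholders for illustration, not real PaddleOCR, EasyOCR, or docTR calls.

```python
from typing import Callable, Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of hypothesis against the ground-truth reference."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


def rank_engines(
    engines: Dict[str, Callable[[str], str]],
    samples: List[Tuple[str, str]],
) -> List[Tuple[str, float]]:
    """Return (engine name, mean CER) pairs, best (lowest CER) first.

    `samples` holds (image_path, ground_truth_text) pairs.
    """
    if not samples:
        raise ValueError("need at least one labelled sample")
    scores = []
    for name, recognize in engines.items():
        total = sum(cer(truth, recognize(path)) for path, truth in samples)
        scores.append((name, total / len(samples)))
    return sorted(scores, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical stand-ins for real engine wrappers; they return canned text.
    samples = [("label1.png", "NET WT 500g")]
    engines = {
        "engine_a": lambda path: "NET WT 500g",
        "engine_b": lambda path: "NET VT 5O0g",
    }
    for name, score in rank_engines(engines, samples):
        print(f"{name}: mean CER {score:.3f}")
```

To compare real engines, each lambda would be replaced with a small wrapper around that library's own recognition call, and `samples` with hand-labelled images from the target use case.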
DeepFaceLive
- Is it possible to sync a lip and facial expression animation with audio in real time?
- Is there a way to do facial rigs on AI images?
A more lifelike deformer would be running a 'deepfake' layer over your face motion into your 2D character face, but I haven't tried it yet. Here is an example of a well-known open source 'faceswapper': https://github.com/iperov/DeepFaceLive
- Animate your stable diffusion portraits
- Selfhosted AI
- Deepfakes in High-Resolution Created From a Single Photo
- AI MoistCritical roasts the fuck out of Athene
Live video feed deepfake: DeepFaceLive
- Stop Developing This Technology
- Keanu Reeves started streaming on Twitch
DeepFaceLive
Edit: Ah this is literally DFL, Keanu is another default face now: https://github.com/iperov/DeepFaceLive
I guess DeepFaceLive. Based on DeepFaceLab. Almost all DeepFakes are made with it.
What are some alternatives?
DeepFaceLab - DeepFaceLab is the leading software for creating deepfakes.
EasyOCR - Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
mmocr - OpenMMLab Text Detection, Recognition and Understanding Toolbox
Tesseract.js - Pure Javascript OCR for more than 100 Languages 📖🎉🖥
OCRmyPDF - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
keras-ocr - A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
OpenScan - A privacy-friendly Document Scanner app
onnx-simplifier - Simplify your onnx model
gImageReader - A Gtk/Qt front-end to tesseract-ocr.
Wav2Lip - This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs