Metric | videocr-PaddleOCR | docutron
---|---|---
Mentions | 3 | 2
Stars | 110 | 17
Growth | - | -
Activity | 4.2 | 5.8
Last commit | 3 months ago | 6 months ago
Language | Jupyter Notebook | Jupyter Notebook
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
videocr-PaddleOCR
-
How to run this Python program, total noob
I have almost never used Python, and I want to use this program, https://github.com/oliverfei/videocr-PaddleOCR, or this one, https://github.com/apm1467/videocr, but I don't understand how to use them. If someone can explain, it would be a huge help. The second one also supports my language, so I think it's the better choice.
-
Anyone know any mobile apps or websites I can use to start translating donghua? Got some time on my hands and nothing to do
A lot of times, Chinese videos will have subtitles hardcoded in them. I've worked on a Python library that can extract these using OCR which should save some time when it comes to getting the proper timings: https://github.com/oliverfei/videocr-PaddleOCR
-
I want to make English subtitles for a series…
For extracting subtitles, and most importantly their timings, from videos with hardcoded subtitles, I recently worked on this open source Python library: https://github.com/oliverfei/videocr-PaddleOCR. For those familiar with coding, you could probably also write a script to run the extracted subtitles through machine translation services, e.g. with this library. I've been meaning to build this into a more user-friendly application in the future when I have the time.
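To illustrate the idea in the posts above, here is a minimal sketch of how per-frame OCR text might be turned into timed subtitle cues and then passed through a translation step. It does not use videocr-PaddleOCR and is not that library's method. The names `Cue`, `frames_to_cues`, `srt_time`, and `to_srt` are hypothetical, the OCR results are assumed to arrive as `(timestamp_in_seconds, text)` pairs, and `translate` stands in for whatever machine translation call you choose.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List, Tuple


@dataclass
class Cue:
    start: float  # cue start, in seconds
    end: float    # cue end, in seconds
    text: str     # text of the first frame in the cue


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """True when two OCR readings are close enough to count as one subtitle."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def frames_to_cues(frames: List[Tuple[float, str]], frame_step: float) -> List[Cue]:
    """Merge consecutive frames with near-identical text into timed cues.

    `frames` holds (timestamp_in_seconds, ocr_text) pairs sampled every
    `frame_step` seconds; empty text means no subtitle was detected.
    """
    cues: List[Cue] = []
    for t, text in frames:
        text = text.strip()
        if not text:
            continue  # a blank frame ends the current cue
        last = cues[-1] if cues else None
        # Extend the previous cue only if this frame directly follows it
        # and the text is (nearly) the same.
        if last and t - last.end < frame_step / 2 and similar(last.text, text):
            last.end = t + frame_step
        else:
            cues.append(Cue(t, t + frame_step, text))
    return cues


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def to_srt(cues: List[Cue], translate: Callable[[str], str] = lambda s: s) -> str:
    """Render cues as SRT text, passing each line through `translate`."""
    blocks = [
        f"{i}\n{srt_time(c.start)} --> {srt_time(c.end)}\n{translate(c.text)}\n"
        for i, c in enumerate(cues, start=1)
    ]
    return "\n".join(blocks)
```

For example, `to_srt(frames_to_cues(frames, frame_step=0.5), translate=my_translate)` would produce translated subtitles that keep the original timings, where `my_translate` is a function you supply.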
docutron
What are some alternatives?
Aegisub - Cross-platform advanced subtitle editor
deep-text-recognition-benchmark - Text recognition (optical character recognition) with deep learning methods, ICCV 2019
Paddle2ONNX - ONNX Model Exporter for PaddlePaddle
unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Paperless-ng - A supercharged version of paperless: scan, index and archive all your physical documents
genalog - Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.
HandBrake - HandBrake's main development repository
document-ai-samples - Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud
Multi-Type-TD-TSR - Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition
ocrpy - OCR, Archive, Index and Search: Implementation agnostic OCR framework.