docutron vs genalog
Metric | docutron | genalog
---|---|---
Mentions | 2 | 1
Stars | 17 | 296
Growth | - | 1.4%
Activity | 5.8 | 0.0
Last commit | 7 months ago | 4 months ago
Language | Jupyter Notebook | Jupyter Notebook
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
docutron
-
genalog
Microsoft Unveils Genalog: An Open Source, AI Cross-Platform Python Package For Generating Document Images With Synthetic Noise
GitHub: https://github.com/microsoft/genalog
What are some alternatives?
deep-text-recognition-benchmark - Text recognition (optical character recognition) with deep learning methods, ICCV 2019
unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
SDV - Synthetic data generation for tabular data
document-ai-samples - Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud
synthetic-data-genomics - Proof of concept code from Gretel.ai and Illumina using generative neural networks to create synthetic versions of mouse genotype and phenotype data.
videocr-PaddleOCR - Extract hardcoded subtitles from videos using machine learning
Copulas - A library to model multivariate data using copulas.
ocrpy - OCR, Archive, Index and Search: Implementation agnostic OCR framework.
ML-For-Beginners - 12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
Multi-Type-TD-TSR - Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition
nist-crc-2023 - NIST Collaborative Research Cycle on Synthetic Data. Learn about Synthetic Data week by week!