Jupyter Notebook OCR

Open-source Jupyter Notebook projects categorized as OCR

Top 14 Jupyter Notebook OCR Projects

  • deep-text-recognition-benchmark

    Text recognition (optical character recognition) with deep learning methods, ICCV 2019

  • Pix2Text

    Pix In, Latex & Text Out. Recognize Chinese, English Texts, and Math Formulas from Images. 80+ languages are supported.

    Project mention: How do I solve this? | /r/LaTeX | 2023-06-11

    Use this: https://p2t.behye.com/

  • tarsier

    Vision utilities for web interaction agents 👀

    Project mention: Control the browser using GPT-4 vision by AgentGPT team | news.ycombinator.com | 2023-11-12

  • PyMuPDF-Utilities

    Demos, examples and utilities using PyMuPDF

    Project mention: Anybody has code for a gui app to extract images from several pdfs at once? | /r/Python | 2023-05-13

  • deep-text-recognition-benchmark

    PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR) (by roatienza)

  • Multi-Type-TD-TSR

    Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition

  • ocrpy

    OCR, Archive, Index and Search: Implementation agnostic OCR framework.

  • document-ai-samples

    Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud

    Project mention: When Will the GenAI Bubble Burst? | news.ycombinator.com | 2024-04-04

    Thanks for the example and that sounds really solid cost savings and definitely agree with the trend that it is here to stay.

    For invoice parsing (various formats), are you just using GPT4V? When GPT4V initially came out, I benchmarked it against an out-of-the-box invoice parser from Google Cloud (https://cloud.google.com/document-ai) on 16 documents and it was much better accuracy-wise. For example, I'd get results parsing 10,100 as 101100 (no comma).

    Curious if you saw problems like this in your pipeline or if it's gotten much better since?

  • Calliar

    A dataset for online Arabic calligraphy. A collection of 2500 annotated calligraphic styles.

  • videocr-PaddleOCR

    Extract hardcoded subtitles from videos using machine learning

  • tutorials

    Git repo for articles on the Ergo Sum blog and the YouTube channel https://www.youtube.com/channel/UCiie9CN--dazA7iT2sry5FA (by rogerfitz)

  • Easter2

    Easter2.0: Improving Convolutional Models for Handwritten Text Recognition

  • konfuzio-sdk

    OCR, extract and classify documents. In addition, annotate documents and build your own NLP and Computer Vision models using Python by downloading the data. Find examples in our Colab Notebooks, e.g. how to fine-tune Flair.

  • docutron

    Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents.

    Project mention: Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents | /r/Python | 2023-10-24

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-04.

Index

What are some of the best open-source OCR projects in Jupyter Notebook? This list will help you:

Rank  Project                          Stars
1     deep-text-recognition-benchmark  3,613
2     Pix2Text                         1,277
3     tarsier                          478
4     PyMuPDF-Utilities                463
5     deep-text-recognition-benchmark  275
6     Multi-Type-TD-TSR                236
7     ocrpy                            218
8     document-ai-samples              181
9     Calliar                          136
10    videocr-PaddleOCR                106
11    tutorials                        79
12    Easter2                          73
13    konfuzio-sdk                     52
14    docutron                         16