- donut: Official implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
I'm also interested in experimenting with something like Donut (https://github.com/clovaai/donut), but I haven't found anything on how much VRAM it needs. Does anyone know what the VRAM requirements for a model like this tend to be, and whether there are newer, better models for document parsing?
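As a rough starting point on the VRAM question, the memory taken by the weights alone can be estimated from the parameter count and the numeric precision. This is only a minimal sketch: the function name `weight_memory_gib` and the 200-million parameter count are illustrative assumptions, not figures from the Donut repository. Activations, optimizer state, and framework overhead all add more on top.

```python
# Bytes needed to store one parameter in each common precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str = "float16") -> float:
    """Estimate the memory (in GiB) needed to hold the model weights only.

    Excludes activations, gradients, optimizer state, and any
    framework overhead, so real usage will be higher.
    """
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unknown dtype: {dtype}")
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    # Hypothetical parameter count, chosen only for illustration.
    hypothetical_params = 200_000_000
    for dtype in BYTES_PER_PARAM:
        gib = weight_memory_gib(hypothetical_params, dtype)
        print(f"{dtype}: about {gib:.2f} GiB for weights")
```

For fine-tuning, the total is typically several times the weight-only figure, since gradients and optimizer state are stored per parameter as well. Measuring peak memory on a small batch is the more reliable check.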
Related posts
- Ask HN: Why are all OCR outputs so raw?
- [D] Is there a good ai model for image-to-text where the images are diagrams and screenshots of interfaces?
- How to Automate Document Extraction from Insurance Documents
- Any way to convert my handwritten diary to searchable PDFs?
- Donut: OCR-Free Document Understanding Transformer