tabula
awesome-document-understanding
| | tabula | awesome-document-understanding |
|---|---|---|
| Mentions | 11 | 4 |
| Stars | 6,475 | 1,072 |
| Growth | 0.9% | - |
| Activity | 0.0 | 4.5 |
| Last commit | 6 months ago | 10 months ago |
| Language | CSS | - |
| License | MIT License | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tabula
- Automatically reading data out of PDFs
- Ruby
Another option would be JRuby. I routinely use an application called Tabula, which is built using JRuby and compiles to a Jar file. This, of course, requires Java on the target machine, but you can ship the Jar file and it will work. It's often easier to rely on a working Java environment than on a working Ruby environment, especially on Windows.
- I am looking to automate a process at work...
- Pdfsandwich
While trying to find a specific project I recalled, I encountered this list of projects which might be of interest: https://github.com/tstanislawek/awesome-document-understandi...
The project I had in mind was similar to this one but I can't remember the name currently: https://github.com/tabulapdf/tabula
However, if you're looking for an ML-based, invoice-specific project, it looks like the other comment to your reply might be more useful.
- Tabula: Liberate Data From PDF Tables [jRuby]
awesome-document-understanding
- Extract information from invoices with machine learning
Check out this repository for inspiration: https://github.com/tstanislawek/awesome-document-understanding
What are some alternatives?
InvoiceNet - Deep neural network to extract intelligent information from invoice documents.
obsidian-notion-like-tables - Your premiere tool for creating and managing tabular data in Obsidian.md
Apache PDFBox - Mirror of Apache PDFBox
unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Awesome-pytorch-list - A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
awesome-document-understandi
ripgrep-all - rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.
awesome-ocr
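awesome-english-ebooks - Free downloads of English-language magazines such as The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic, in epub, mobi, and pdf formats, updated weekly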
OCRmyPDF - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
laravel-report-generator - Rapidly Generate Simple Pdf, CSV, & Excel Report Package on Laravel