PDFMiner
ReportLab
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PDFMiner
- Creating a Python class for organizing courses I took in my education
Technically this information is on my transcript, so I will try to use pdfminer to extract that data. Is there a class you would recommend when using that code? https://github.com/pdfminer/pdfminer.six
- Serious question about metadata!
- Desktop file search: how does it work? Python or C# libraries?
- Add text to an existing PDF using Python
PDFMiner - for getting the fields of every possible PDF; only needed for unusual cases where standard PDF reader methods do not work
- I'm having trouble with the PDFMiner library in Python. Whenever I try to call a certain function, it says that something is missing in the library itself
The GitHub page says it has been superseded by pdfminer.six. Perhaps try that instead.
- Extract specific data from multiple PDF files
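Several of the posts above ask how to pull structured data out of a PDF with pdfminer. The following is a minimal sketch of one way to approach the transcript question, assuming pdfminer.six is installed and that each course appears on its own line as "CODE Title Credits Grade". The `Course` class, the `COURSE_LINE` pattern, and both helper functions are hypothetical names introduced here for illustration; a real transcript will likely need a different pattern.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Course:
    """One course row from a transcript (hypothetical layout)."""
    code: str
    title: str
    credits: float
    grade: str


# Assumed line layout, e.g. "CS101 Intro to Programming 3.0 A-".
# Adjust this pattern to match the actual transcript.
COURSE_LINE = re.compile(
    r"^(?P<code>[A-Z]{2,4}\s?\d{3})\s+"
    r"(?P<title>.+?)\s+"
    r"(?P<credits>\d+(?:\.\d+)?)\s+"
    r"(?P<grade>[A-F][+-]?)$"
)


def parse_courses(text: str) -> List[Course]:
    """Return a Course for every line of text that matches COURSE_LINE."""
    courses = []
    for line in text.splitlines():
        match = COURSE_LINE.match(line.strip())
        if match:
            courses.append(
                Course(
                    code=match["code"],
                    title=match["title"],
                    credits=float(match["credits"]),
                    grade=match["grade"],
                )
            )
    return courses


def courses_from_pdf(path: str) -> List[Course]:
    """Extract the text of a PDF with pdfminer.six and parse course lines."""
    # Imported here so parse_courses can be used without pdfminer.six installed.
    from pdfminer.high_level import extract_text

    return parse_courses(extract_text(path))
```

Keeping the parsing separate from the PDF extraction makes it possible to check the pattern against a plain string first, then call `courses_from_pdf("transcript.pdf")` (a placeholder filename) once the pattern fits. Text extracted from PDFs often comes out in an unexpected order, so it is worth printing the raw output of `extract_text` before relying on any pattern.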
ReportLab
We haven't tracked posts mentioning ReportLab yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
PyPDF2 - A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
pdfplumber - Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
WeasyPrint - The awesome document factory
pdfminer.six - Community maintained fork of pdfminer - we fathom PDF
borb - borb is a library for reading, creating and manipulating PDF files in python.
Camelot - A Python library to extract tabular data from PDFs
pymorphy2 - Morphological analyzer / inflection engine for Russian and Ukrainian languages.
PyMuPDF - PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
pdftabextract - A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.