Pdfminer.six Alternatives

Similar projects and alternatives to pdfminer.six

PyPDF2

30 7,396 9.5 Python pdfminer.six VS PyPDF2

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
pdfplumber

29 5,527 8.4 Python pdfminer.six VS pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
Jina AI examples

22 403 9.6 Python pdfminer.six VS Jina AI examples

Discontinued Jina examples and demos to help you get started (by jina-ai)
PDFMiner

6 5,179 0.0 Python pdfminer.six VS PDFMiner

Python PDF Parser (Not actively maintained). Check out pdfminer.six.
gptty

12 47 5.5 Python pdfminer.six VS gptty

ChatGPT wrapper in your TTY
tabula-py

4 2,054 7.2 Python pdfminer.six VS tabula-py

Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame
OCRmyPDF

77 11,936 9.6 Python pdfminer.six VS OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
PyPDF2

1 4,162 8.9 Python pdfminer.six VS PyPDF2

Discontinued A utility to read and write PDFs with Python [Moved to: https://github.com/py-pdf/PyPDF2] (by mstamy2)
WeasyPrint

43 6,635 9.4 Python pdfminer.six VS WeasyPrint

The awesome document factory
notion-export-client

2 143 3.6 JavaScript pdfminer.six VS notion-export-client

Notion backup client tool: performs a one-way conversion of specified Notion pages into local markdown files
textract-cli

2 6 3.3 Python pdfminer.six VS textract-cli

CLI utility for using AWS Textract DetectDocumentText to OCR image files in synchronous mode without uploading to S3.
GPT_Terminal

2 0 10.0 Python pdfminer.six VS GPT_Terminal

Discontinued A command line ai assistant with customizable preheader for adding style and formatting
anvil-parser

1 84 0.0 Python pdfminer.six VS anvil-parser

Discontinued A Minecraft anvil file format parser

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better pdfminer.six alternative or higher similarity.

pdfminer.six reviews and mentions

Posts with mentions or reviews of pdfminer.six. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-02.

Code to extract text from pdf to excel
2 projects | /r/Python | 2 Jun 2023

I love to use PDFMiner and PDFQuery for this https://github.com/pdfminer/pdfminer.six https://towardsdatascience.com/scrape-data-from-pdf-files-using-python-and-pdfquery-d033721c3b28
Advanced PDF to Excel with documents and example code
2 projects | /r/learnpython | 1 May 2023
how do I automate extracting data from two pdfs and input into an excel sheet according to an order number
2 projects | /r/learnpython | 24 Apr 2023

Entering things in Excel is very easy. Extracting things from PDF is a pain. This (https://github.com/pdfminer/pdfminer.six) gets pretty close to what you need, but it may be easier to use this to just convert the entire PDF to text and parse the text to extract the info you need.
Can I make a code to compare a pdf file and an excel sheet by line by line tell the difference in amounts?
1 project | /r/learnpython | 14 Apr 2023
How do I now access GPT-4? I click the link but it just takes me to the information page, I don’t have access to it on the API playground page.
4 projects | /r/OpenAI | 29 Mar 2023

Convert pdf to string https://github.com/pdfminer/pdfminer.six
Extracting text from PDFs using pdfminer
1 project | /r/learnpython | 23 Jan 2023
Recommendations for parsing text from .pdf files
2 projects | /r/Python | 14 Dec 2022

Now I see that the project is abandoned but there's an active fork called pdfminer.six. Hope that helps.
Creating a python class for organizing courses I took in my education
2 projects | /r/learnpython | 15 Oct 2022

Technically this information is on my transcript, so I will be trying to use pdfminer to extract that data if there is a way to use a class you recommend when using that code https://github.com/pdfminer/pdfminer.six
Show HN: Search PDFs with Transformers and Python Notebook
4 projects | news.ycombinator.com | 25 Jul 2022
Best tools for PDF Scraping?
1 project | /r/datascience | 1 Jun 2022
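
Several of the posts above suggest converting the whole PDF to text and then parsing that text. A minimal sketch of that approach follows, assuming pdfminer.six is installed. `extract_text` is the high-level function from `pdfminer.high_level`; the `Order <number> ... <amount>` line layout, the `ORDER_LINE` pattern, and the helper names are hypothetical and would need adapting to the actual documents.

```python
import re


def pdf_to_text(path):
    """Return all text in the PDF at `path` as one string."""
    # Imported here so the parsing helper below works without pdfminer.six installed.
    from pdfminer.high_level import extract_text
    return extract_text(path)


# Hypothetical line layout: "Order 1001 <description> 1,250.00"
ORDER_LINE = re.compile(r"Order\s+(\d+)\b.*?(\d[\d,]*\.\d{2})")


def amounts_by_order(text):
    """Map order number -> amount for each line that matches ORDER_LINE."""
    result = {}
    for line in text.splitlines():
        match = ORDER_LINE.search(line)
        if match:
            # Drop thousands separators before converting to a number.
            result[match.group(1)] = float(match.group(2).replace(",", ""))
    return result
```

With a real file this might be called as `amounts_by_order(pdf_to_text("invoice.pdf"))`, where the file name is a placeholder. The resulting dictionary can then be written to a spreadsheet with whichever Excel library is preferred.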