Python pdf-documents

Open-source Python projects categorized as pdf-documents

Top 4 Python pdf-document Projects

  1. PyPDF2

    A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files

    Project mention: Using Docling’s OCR features with RapidOCR | dev.to | 2025-04-03
  2. PyMuPDF

    PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

    Project mention: Using Docling’s OCR features with RapidOCR | dev.to | 2025-04-03
  3. pypdfium2

    Python bindings to PDFium

    Project mention: Using Docling’s OCR features with RapidOCR | dev.to | 2025-04-03
  4. pdfalyzer

    Analyze PDFs. With colors. And Yara.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python pdf-documents related posts

  • Using Docling’s OCR features with RapidOCR

    9 projects | dev.to | 3 Apr 2025

Index

What are some of the best open-source pdf-document projects in Python? This list will help you:

#  Project    Stars
1  PyPDF2     9,065
2  PyMuPDF    7,168
3  pypdfium2    571
4  pdfalyzer    265
