i7j-rups
PyMuPDF
 | i7j-rups | PyMuPDF
---|---|---
Mentions | 3 | 5
Stars | 248 | 4,053
Growth | 0.8% | 4.1%
Activity | 5.3 | 9.8
Last commit | 12 days ago | 8 days ago
Language | Java | Python
License | GNU General Public License v3.0 or later | GNU Affero General Public License v3.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
i7j-rups
- So you want to modify the text of a PDF by hand
Great post. I've spent a lot of time reading through the PDF specification over the last ~5 years while building DocSpring [1], and I still feel like I've barely scratched the surface. qpdf is a great tool. One of my other favorites is RUPS [2], which really lets you dig into the structure of a PDF.
[1] https://docspring.com
[2] https://github.com/itext/i7j-rups
- Show HN: I am building a new Python library to read/write PDF files
> find a version of iText RUPS application from somewhere on the internet
You mean this, right? https://github.com/itext/i7j-rups#readme
- Any decent free online tool which can give me a breakdown of pdf contents including relative sizes of assets such as images, fonts, etc?
It's not an online tool, but it's free nonetheless: https://github.com/itext/i7j-rups
PyMuPDF
- FLaNK Stack for 04 December 2023
- Converting markdown to pdf in Python
This method uses two libraries: markdown-it-py (conversion from markdown to html) and [PyMuPDF](https://github.com/pymupdf/PyMuPDF) (conversion from html to pdf). A small Python class links them together.
- Show HN: I am building a new Python library to read/write PDF files
I think you might mean PyMuPDF (https://github.com/pymupdf/PyMuPDF), a Python library built on top of the MuPDF C library (https://mupdf.com/).
PyMuPDF and MuPDF are both available under dual open source AGPL and commercial licenses. They have been around for many years and are under continual development.
[Disclaimer: I work for Artifex, who wrote MuPDF and recently acquired PyMuPDF.]
- M1 Mac: myuPDF install (wheel?)
- legacy install error: PyMuPDF?
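
The "Converting markdown to pdf in Python" mention above describes a two-step pipeline (markdown-it-py for Markdown to HTML, PyMuPDF for HTML to PDF) tied together by a small class. The linked post's own class is not reproduced here; what follows is only a minimal sketch of how such a pipeline might look, assuming PyMuPDF's `Story` and `DocumentWriter` API (available in recent releases, imported under the `fitz` name). The class name `MarkdownToPdf`, its methods, and the default paper size and margin are hypothetical choices made for illustration.

```python
class MarkdownToPdf:
    """Hypothetical helper: Markdown -> HTML (markdown-it-py) -> PDF (PyMuPDF)."""

    def __init__(self, paper="a4", margin=36):
        # Paper format name and page margin in points; both are assumed defaults.
        self.paper = paper
        self.margin = margin

    def to_html(self, markdown_text):
        # Imported lazily so the class can be defined without markdown-it-py installed.
        from markdown_it import MarkdownIt

        return MarkdownIt().render(markdown_text)

    def to_pdf(self, markdown_text, out_path):
        # Imported lazily for the same reason; `fitz` is PyMuPDF's import name.
        import fitz

        html = self.to_html(markdown_text)
        story = fitz.Story(html=html)           # lays out the HTML content
        writer = fitz.DocumentWriter(out_path)  # writes the pages to a PDF file
        mediabox = fitz.paper_rect(self.paper)  # full page rectangle
        m = self.margin
        where = mediabox + (m, m, -m, -m)       # printable area inside the margins

        pages = 0
        more = True
        while more:  # keep adding pages until all content has been placed
            device = writer.begin_page(mediabox)
            more, _ = story.place(where)
            story.draw(device)
            writer.end_page()
            pages += 1
        writer.close()
        return pages  # number of pages written
```

With both packages installed, `MarkdownToPdf().to_pdf("# Title\n\nSome *text*.", "out.pdf")` should write a PDF to `out.pdf` and return the number of pages it produced.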
What are some alternatives?
pdfsyntax - A Python library to inspect and modify the internal structure of a PDF file
PyPDF2 - A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
djot - A light markup language
ReportLab
annotated-pdf-spec - Collection of useful hints for implementing a PDF library
pdfplumber - Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
kaitai_struct_formats - Kaitai Struct: library of binary file formats (.ksy)
borb - borb is a library for reading, creating and manipulating PDF files in python.
bericht - Incremental HTML to PDF converter.
PDFMiner - Python PDF Parser (Not actively maintained). Check out pdfminer.six.
polyfile - A pure Python cleanroom implementation of libmagic, with instrumented parsing from Kaitai struct and an interactive hex viewer
pdfquery - A fast and friendly PDF scraping library.