Top 23 PDF Open-Source Projects
-
quivr
Your GenAI Second Brain 🧠 A personal productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users ! Local & Private alternative to OpenAI GPTs & ChatGPT powered by retrieval-augmented generation.
Project mention: privateGPT VS quivr - a user suggested alternative | libhunt.com/r/privateGPT | 2024-01-12
-
Project mention: How can I turn awesome-cv coverletter.tex and cv.tex into a single PDF? | /r/LaTeX | 2023-10-02
I am in the process of rewriting my CV using the [awesome-cv](https://github.com/posquit0/Awesome-CV) template and am pretty happy with how things are turning out.
-
Stirling-PDF
A locally hosted web application that lets you perform various operations on PDF files
-
paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
I steered a friend towards Paperless (and away from an LLM solution) as a way of searching/accessing GBs of architectural PDFs recently - so far, it’s apparently working well for them.
-
best-resume-ever
Build multiple beautiful resumes quickly and easily, and create your best CV ever! Made with Vue and LESS.
-
koodo-reader
A modern ebook manager and reader with sync and backup capabilities for Windows, macOS, Linux and Web
-
koreader
An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices
-
I am playing around with this GitHub project, which takes a user question as input and immediately runs a vector search on it to find relevant stored information before delivering an answer.
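The flow described in that mention (question in, vector search over stored text, answer built from the hits) can be sketched roughly as follows. This is a toy illustration, not the project's actual code: the `embed`, `cosine`, and `search` helpers, the sample chunks, and the bag-of-words "embedding" are all assumptions standing in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(question, chunks, top_k=2):
    """Return the top_k chunks ranked by similarity to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


# Hypothetical document chunks, as if extracted from a PDF.
chunks = [
    "Invoices are due within 30 days of the billing date.",
    "The warranty covers parts and labor for two years.",
    "Late invoices incur a fee of two percent per month.",
]

# Retrieve the most relevant chunk, then build the prompt a language model would receive.
context = search("When are invoices due?", chunks, top_k=1)
prompt = "Answer using this context:\n" + "\n".join(context)
```

In a real setup the retrieved chunks would be passed to a language model along with the question; here the sketch stops at building the prompt string.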
-
Using react-pdf, we crafted a solution that allowed users to manipulate their reports with an impressive degree of flexibility. But, as data grew (imagine trying to cram an entire financial year's worth of invoices, up to 22,000 rows, into one PDF), our solution began to falter, especially on older PCs with limited resources.
-
Do you mind reporting those issues either to SumatraPDF at https://github.com/sumatrapdfreader/sumatrapdf/issues or directly to MuPDF at https://bugs.ghostscript.com/ if it also has the same issue? Thank you!
There are many wonderfully weird PDFs and epubs out there, but we do our best to fix issues. :)
-
mit-deep-learning-book-pdf
MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville
-
Project mention: TextSnatcher: Copy text from images, for the Linux Desktop | news.ycombinator.com | 2024-03-14
Try https://github.com/ocrmypdf/OCRmyPDF - it uses Tesseract behind the scenes and is absolutely brilliant.
-
milewski-ctfp-pdf
Bartosz Milewski's 'Category Theory for Programmers' unofficial PDF and LaTeX source
Project mention: reflect-cpp - Now with compile time extraction of field names from structs and enums using C++-20. | /r/cpp | 2023-12-09
Category Theory for Programmers by Bartosz Milewski (https://github.com/hmemcpy/milewski-ctfp-pdf/releases)
-
QuestPDF
QuestPDF is a modern open-source .NET library for PDF document generation. It offers a comprehensive layout engine powered by a concise and discoverable C# Fluent API. Easily generate PDF reports, invoices, exports, etc.
QuestPDF looks really good (I haven't used it) but I believe they changed their license recently.
-
Project mention: Intro to DOMPDF - lightest and simplest PHP library to generate PDF documents | dev.to | 2024-04-05
Generating PDF documents out of your app's HTML output is a very common requirement and there are several open source libraries to accomplish this. I came across this need for my project recently and I evaluated many popular ones such as TCPDF, mpdf, FPDF, etc. But the one that truly stood up to my evaluation in terms of efficiency (minimal footprint) and ease of implementation was DOMPDF.
-
xournalpp
Xournal++ is a handwriting notetaking software with PDF annotation support. Written in C++ with GTK3, supporting Linux (e.g. Ubuntu, Debian, Arch, SUSE), macOS and Windows 10. Supports pen input from devices such as Wacom Tablets.
Project mention: Rnote – An open-source vector-based drawing app | news.ycombinator.com | 2024-03-11
I highly recommend Rnote to anyone on Linux who misses the "hodgepodge" notetaking of apps like OneNote. It works like a dream on touchscreens and drawing tablets, with a surprising amount of configuration under the hood.
Also worth noting is Xournal, an older but similar project: https://xournalpp.github.io/
-
After some research, I found libvips, a demand-driven, horizontally threaded image processing library. It is designed to run quickly while using as little memory as possible.
-
Project mention: 33 React Libraries Every React Developer Should Have In Their Arsenal | dev.to | 2024-01-07
23. react-pdf
-
PyPDF2
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
PDF related posts
- KOReader Document Viewer for E Ink devices
- Plato Document Reader for Kobo E-Readers
- Intro to DOMPDF - lightest and simplest PHP library to generate PDF documents
- When Will the GenAI Bubble Burst?
- Show HN: I just open sourced my document/website extractor for Vision-LLMs
- Running OCR against PDFs and images directly in the browser
- HTML to PDF renderers: A simple comparison
Index
What are some of the best open-source PDF projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | quivr | 31,972 |
2 | Awesome-CV | 21,704 |
3 | Stirling-PDF | 21,464 |
4 | paperless-ngx | 16,576 |
5 | awesome-english-ebooks | 16,350 |
6 | best-resume-ever | 16,218 |
7 | Etherpad | 15,798 |
8 | koodo-reader | 15,478 |
9 | koreader | 15,126 |
10 | gpt4-pdf-chatbot-langchain | 14,520 |
11 | react-pdf | 14,080 |
12 | sumatrapdf | 12,529 |
13 | mit-deep-learning-book-pdf | 12,284 |
14 | OCRmyPDF | 11,866 |
15 | milewski-ctfp-pdf | 10,733 |
16 | QuestPDF | 10,403 |
17 | Dompdf | 10,252 |
18 | xournalpp | 10,180 |
19 | Zettlr | 9,587 |
20 | libvips | 8,958 |
21 | react-pdf | 8,514 |
22 | PyPDF2 | 7,359 |
23 | PHPWord | 7,097 |