PDF

Top 23 PDF Open-Source Projects

  • quivr

    Your GenAI Second Brain 🧠 A personal productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users! Local & Private alternative to OpenAI GPTs & ChatGPT powered by retrieval-augmented generation.

    Project mention: privateGPT VS quivr - a user suggested alternative | libhunt.com/r/privateGPT | 2024-01-12

  • Awesome-CV

    📄 Awesome CV is a LaTeX template for your outstanding job application

    Project mention: How can I turn awesome-cv coverletter.tex and cv.tex into a single PDF? | /r/LaTeX | 2023-10-02

    I am in the process of rewriting my CV using the [awesome-cv](https://github.com/posquit0/Awesome-CV) template and am pretty happy with how things are turning out.

  • Stirling-PDF

    Locally hosted web application that allows you to perform various operations on PDF files

    Project mention: FLaNK Weekly 31 December 2023 | dev.to | 2023-12-31

  • paperless-ngx

    A community-supported supercharged version of paperless: scan, index and archive all your physical documents

    Project mention: I accidentally built a meme search engine | news.ycombinator.com | 2024-04-13

    I steered a friend towards Paperless (and away from an LLM solution) as a way of searching/accessing GBs of architectural PDFs recently - so far, it’s apparently working well for them.

    https://github.com/paperless-ngx/paperless-ngx

  • awesome-english-ebooks

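    Free downloads of English-language magazines, including The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic, in epub, mobi, and pdf formats. Updated weekly.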

  • best-resume-ever

    👔 💼 Build fast 🚀 and easy multiple beautiful resumes and create your best CV ever! Made with Vue and LESS.

  • Etherpad

    Etherpad: A modern really-real-time collaborative document editor.

    Project mention: Edit This Blog Post | news.ycombinator.com | 2024-02-06

  • koodo-reader

    A modern ebook manager and reader with sync and backup capacities for Windows, macOS, Linux and Web

  • koreader

    An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices

    Project mention: KOReader Document Viewer for E Ink devices | news.ycombinator.com | 2024-04-10

    [2]: https://github.com/koreader/koreader/wiki/Dictionary-support...

  • gpt4-pdf-chatbot-langchain

    GPT4 & LangChain Chatbot for large PDF docs

    Project mention: Back and forth conversations before a vector search? | /r/LangChain | 2023-08-30

    I am playing around with this GitHub project, which takes a user question as input and immediately runs a vector search on it to find relevant stored information before delivering an answer.

  • react-pdf

    📄 Create PDF files using React

    Project mention: How we improved our client-side PDF generation by 5x | dev.to | 2024-03-17

    Using react-pdf, we crafted a solution that allowed users to manipulate their reports with an impressive degree of flexibility. But, as data grew (imagine trying to cram an entire financial year's worth of invoices, up to 22,000 rows, into one PDF), our solution began to falter, especially on older PCs with limited resources.

  • sumatrapdf

    SumatraPDF reader

    Project mention: SumatraPDF Reader | news.ycombinator.com | 2023-10-23

    Do you mind reporting those issues either to SumatraPDF at https://github.com/sumatrapdfreader/sumatrapdf/issues or directly to MuPDF at https://bugs.ghostscript.com/ if it also has the same issue? Thank you!

    There are many wonderfully weird PDFs and epubs out there, but we do our best to fix issues. :)

  • mit-deep-learning-book-pdf

    MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville

    Project mention: Deep Learning Course | news.ycombinator.com | 2023-11-19

  • OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

    Project mention: TextSnatcher: Copy text from images, for the Linux Desktop | news.ycombinator.com | 2024-03-14

    Try https://github.com/ocrmypdf/OCRmyPDF - it uses Tesseract behind the scenes and it is absolutely brilliant.

  • milewski-ctfp-pdf

    Bartosz Milewski's 'Category Theory for Programmers' unofficial PDF and LaTeX source

    Project mention: reflect-cpp - Now with compile time extraction of field names from structs and enums using C++-20. | /r/cpp | 2023-12-09

    Category Theory for Programmers by Bartosz Milewski (https://github.com/hmemcpy/milewski-ctfp-pdf/releases)

  • QuestPDF

    QuestPDF is a modern open-source .NET library for PDF document generation. It offers a comprehensive layout engine powered by a concise and discoverable C# Fluent API. Easily generate PDF reports, invoices, exports, etc.

    Project mention: How do you generate pdf files with charts? | /r/dotnet | 2023-06-21

    QuestPDF looks really good (I haven't used it) but I believe they changed their license recently.

  • Dompdf

    HTML to PDF converter for PHP

    Project mention: Intro to DOMPDF - lightest and simplest PHP library to generate PDF documents | dev.to | 2024-04-05

    Generating PDF documents out of your app's HTML output is a very common requirement and there are several open source libraries to accomplish this. I came across this need for my project recently and I evaluated many popular ones such as TCPDF, mpdf, FPDF, etc. But the one that truly stood up to my evaluation in terms of efficiency (minimal footprint) and ease of implementation was DOMPDF.

  • xournalpp

    Xournal++ is a handwriting notetaking software with PDF annotation support. Written in C++ with GTK3, supporting Linux (e.g. Ubuntu, Debian, Arch, SUSE), macOS and Windows 10. Supports pen input from devices such as Wacom Tablets.

    Project mention: Rnote – An open-source vector-based drawing app | news.ycombinator.com | 2024-03-11

    I highly recommend Rnote to anyone on Linux that misses the "hodgepodge" notetaking of apps like OneNote. It works like a dream on touchscreens and drawing tablets, with a surprising amount of configuration under the hood.

    Also worth noting is Xournal, an older but similar project: https://xournalpp.github.io/

  • Zettlr

    Your One-Stop Publication Workbench

    Project mention: Obsidian 1.5 Desktop (Public) | news.ycombinator.com | 2023-12-26

  • libvips

    A fast image processing library with low memory needs.

    Project mention: Building an online image compressor | dev.to | 2024-01-09

    After some research, I found libvips, a demand-driven, horizontally threaded image processing library. It is designed to run quickly while using as little memory as possible.

  • react-pdf

    Display PDFs in your React app as easily as if they were images. (by wojtekmaj)

    Project mention: 33 React Libraries Every React Developer Should Have In Their Arsenal | dev.to | 2024-01-07

    23. react-pdf

  • PyPDF2

    A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files

    Project mention: Yara scanning PDF files | /r/computerforensics | 2023-06-01

  • PHPWord

    A pure PHP library for reading and writing word processing documents

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-13.

Index

What are some of the best open-source PDF projects? This list will help you:

#   Project                      Stars
1   quivr                        31,972
2   Awesome-CV                   21,704
3   Stirling-PDF                 21,464
4   paperless-ngx                16,576
5   awesome-english-ebooks       16,350
6   best-resume-ever             16,218
7   Etherpad                     15,798
8   koodo-reader                 15,478
9   koreader                     15,126
10  gpt4-pdf-chatbot-langchain   14,520
11  react-pdf                    14,080
12  sumatrapdf                   12,529
13  mit-deep-learning-book-pdf   12,284
14  OCRmyPDF                     11,866
15  milewski-ctfp-pdf            10,733
16  QuestPDF                     10,403
17  Dompdf                       10,252
18  xournalpp                    10,180
19  Zettlr                        9,587
20  libvips                       8,958
21  react-pdf                     8,514
22  PyPDF2                        7,359
23  PHPWord                       7,097