Tangential:
Some time ago I built an automation [1] that checks whether the given PDFs contain the specified keywords and outputs the result as a CSV file.
It is similar to PDFGrep and probably much slower, but potentially more convenient for people who prefer GUIs.
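A minimal sketch of that kind of keyword-to-CSV check, not the code of the automation mentioned above. It assumes poppler's `pdftotext` command-line tool is installed for text extraction; the function names `extract_text` and `keyword_report` are hypothetical, and matching is a plain case-insensitive substring test.

```python
import csv
import io
import subprocess


def extract_text(pdf_path):
    """Extract text with poppler's pdftotext CLI (assumed to be installed)."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" sends the extracted text to stdout
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def keyword_report(texts, keywords):
    """Return CSV text: one row per document, one yes/no column per keyword.

    `texts` maps a document name to its already-extracted text.
    Matching is a case-insensitive substring check (an assumption of this sketch).
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["file", *keywords])
    for name, text in texts.items():
        lowered = text.lower()
        row = ["yes" if kw.lower() in lowered else "no" for kw in keywords]
        writer.writerow([name, *row])
    return out.getvalue()
```

To scan real files, build the mapping with something like `{path: extract_text(path) for path in pdf_paths}` and write the returned string to a `.csv` file. Keeping extraction separate from matching means the report logic can be tried on plain strings without any PDFs at hand.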
Looking at the list of dependencies, it seems like they use poppler-cpp to render the PDFs.
DocQuery (https://github.com/impira/docquery), a project I work on, lets you do something similar, but it searches over semantic information in the PDF files (using a large language model that is pre-trained to query business documents).
For example:
$ docquery scan "What is the due date?" /my/invoices/
For Emacs users there is also https://github.com/jeremy-compostella/pdfgrep which lets you browse the results and open the original docs highlighting the selected match.
I am working on looqs, which can do that (and will also render the page immediately): https://github.com/quitesimpleorg/looqs