-
This list is missing the "pdfsizeopt" suite, which bundles statically compiled utilities to reduce PDF size.
Because the utilities are statically compiled, it runs on most Linux platforms without any extra software installed.
I believe one of its features removes unused characters from embedded fonts.
It really is quite impressive.
-
> Would love to find a cheaper (local) option vs AWS
How about Tesseract (https://github.com/tesseract-ocr/tesseract)?
-
There’s even a library for PHP (https://github.com/thiagoalessio/tesseract-ocr-for-php). I haven’t used it. I did use Pytesseract in Python, and it works fairly well.
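For anyone who wants to try the local route without a wrapper library, here is a rough sketch that shells out to the Tesseract CLI from Python. It assumes the `tesseract` binary is installed and on your PATH, and that the English language data (`eng`) is available. The function names `build_tesseract_command` and `ocr_image` are made up for illustration and are not part of Pytesseract or Tesseract.

```python
import shutil
import subprocess
import sys


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for a Tesseract CLI call (hypothetical helper)."""
    # Passing "stdout" as the output base makes tesseract print the
    # recognized text to standard output instead of writing a .txt file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run local OCR on one image file and return the recognized text."""
    # Assumption: the tesseract binary is installed locally.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python ocr.py scan.png
    print(ocr_image(sys.argv[1]))
```

Recognition quality depends heavily on the input image, so treat this as a starting point rather than a drop-in replacement for a hosted OCR service.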
-
PDFBox can do this. It’s not part of the CLI but it wouldn’t be too hard to add:
https://github.com/apache/pdfbox/blob/5b00807463279f1002e245...
-
This tool might be helpful for comparing pdfs: https://github.com/serhack/pdf-diff
-
I'd like to add this tool to the list: https://pdfsam.org/