Extract Data from PDF

pdfcpu

Mentions: 30 | Stars: 6,206 | Activity: 9.0 | Language: Go

A PDF processor written in Go.

Try https://github.com/pdfcpu/pdfcpu

grumpy

Mentions: 1 | Stars: 417 | Activity: 0.0 | Language: Go

Grumpy is a Python to Go source code transcompiler and runtime. (by grumpyhome)

So if that tool can read it, why not use it for conversion (calling from Go if you prefer)? Or have a look at the source to determine what it does to make the text readable. See also https://github.com/grumpyhome/grumpy

qpdf

Mentions: 18 | Stars: 3,032 | Activity: 9.6 | Language: C++

QPDF: A content-preserving PDF document transformer

UPDATE: We tried repairing the PDF in question and, lo and behold, we got a result. We used qpdf (https://github.com/qpdf/qpdf/releases) for the repair; after that, the ledongthuc/pdf library had no trouble reading the data.
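
The repair step described above could be scripted from Go roughly as follows. This is only a sketch under assumptions: it assumes the qpdf binary is installed and on PATH, that a plain rewrite (`qpdf input.pdf output.pdf`) is the kind of repair the poster ran (the post does not say which options were used), and the names repairArgs, usable, and repairPDF are hypothetical helpers invented for this illustration. qpdf documents exit code 3 as "succeeded with warnings", which is typical for a damaged file it managed to recover, so the sketch treats both 0 and 3 as usable.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// repairArgs builds the arguments for a plain qpdf rewrite:
// qpdf <input> <output>. (Hypothetical helper for this sketch.)
func repairArgs(in, out string) []string {
	return []string{in, out}
}

// usable reports whether a qpdf exit code left a usable output file:
// 0 is success, 3 is success with warnings.
func usable(code int) bool {
	return code == 0 || code == 3
}

// repairPDF runs qpdf (assumed to be on PATH) to rewrite in to out.
func repairPDF(in, out string) error {
	cmd := exec.Command("qpdf", repairArgs(in, out)...)
	msg, err := cmd.CombinedOutput()
	if err == nil {
		return nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) && usable(exitErr.ExitCode()) {
		return nil // warnings only; the output file was still written
	}
	return fmt.Errorf("qpdf failed: %w: %s", err, msg)
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: repair <damaged.pdf> <repaired.pdf>")
		return
	}
	if err := repairPDF(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("repaired copy written to", os.Args[2])
}
```

The repaired copy can then be handed to whichever Go PDF reader is in use, such as the ledongthuc/pdf library mentioned in the post.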

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
