Ask HN: What's a good library/command line tool to extract tables from PDFs?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • excalibur

    A web interface to extract tabular data from PDFs (by camelot-dev)

  • Have not tried it, but this has been in my bookmarks for a while: https://github.com/camelot-dev/excalibur

  • tabulapdf

    Bindings for Tabula PDF Table Extractor Library

  • There is also this option: https://docs.ropensci.org/tabulizer/

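For readers who prefer a script to a web interface, here is a minimal sketch using Camelot, the Python library from the same camelot-dev organization that Excalibur builds on. It assumes the `camelot-py` package and its dependencies are installed. The helper names `csv_target` and `extract_tables_to_csv` and the output file naming scheme are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def csv_target(pdf_path, out_dir, index):
    """Build the output path for the index-th table (hypothetical naming scheme)."""
    return Path(out_dir) / f"{Path(pdf_path).stem}-table-{index}.csv"


def extract_tables_to_csv(pdf_path, out_dir, pages="1", flavor="lattice"):
    """Extract tables from a PDF with Camelot and write one CSV file per table."""
    # Imported lazily so this module still loads where camelot-py is missing.
    import camelot

    Path(out_dir).mkdir(parents=True, exist_ok=True)

    # read_pdf returns a list-like collection of tables; each table exposes
    # its cells as a pandas DataFrame through the .df attribute.
    tables = camelot.read_pdf(str(pdf_path), pages=pages, flavor=flavor)

    written = []
    for index, table in enumerate(tables, start=1):
        target = csv_target(pdf_path, out_dir, index)
        # Camelot's DataFrames use integer column labels, so skip the header.
        table.df.to_csv(target, index=False, header=False)
        written.append(target)
    return written
```

Calling `extract_tables_to_csv("report.pdf", "out")` would write files such as `out/report-table-1.csv`. The `lattice` flavor targets tables with ruled cell borders, while `flavor="stream"` is meant for tables separated by whitespace. Scanned PDFs generally need OCR first, since Camelot works on text-based PDFs.
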
NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • What is the best library for processing table data contained within a PDF?

    2 projects | /r/dotnet | 23 Jun 2023
  • Is there OCR software where I can draw an outline of the columns and rows myself to extract PDF table repeatedly.

    1 project | /r/techsupport | 1 Mar 2023
  • Is it possible to write a script that copies data from a pdf file to an Excel?

    1 project | /r/learnpython | 12 Apr 2021
  • What is the best way to extract tables from scanned pdf's?

    1 project | /r/learnpython | 10 Nov 2022
  • software to convert pdf tables to Excel

    1 project | /r/BusinessIntelligence | 26 Apr 2022