Crates for converting PDFs into Markdown

This page summarizes the projects mentioned and recommended in the original post on /r/rust.

  • pdf-extract

    A Rust library for extracting content from PDFs

  • https://github.com/jrmuizel/pdf-extract could be extended to do something like this.

  • layout-parser

    A Unified Toolkit for Deep Learning Based Document Image Analysis

  • I built my own solution using a combination of Tesseract and OpenCV (in Python). Even though the source PDF content is computer-generated, I still get sporadic OCR errors. After writing my solution, I came across https://github.com/Layout-Parser/layout-parser, which might be a better starting point for dealing with PDFs, but I haven't tried it yet.
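
Neither project above emits Markdown directly, so some post-processing step would be needed after text extraction. What follows is only a rough sketch of such a step, assuming the PDF text has already been extracted into a plain string (for example by a crate such as pdf-extract, which is not called here). The function names `text_to_markdown`, `strip_bullet`, and `looks_like_heading`, and the heuristics themselves (short lines with no lowercase letters become headings, lines starting with a bullet glyph become list items), are hypothetical and are not part of any library mentioned in the post.

```rust
/// Convert plain text (e.g. text already extracted from a PDF) into rough
/// Markdown. The rules below are illustrative assumptions, not a real API.
pub fn text_to_markdown(text: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    for raw in text.lines() {
        let line = raw.trim();
        if line.is_empty() {
            // Collapse runs of blank lines into one paragraph break,
            // and skip blank lines at the very start.
            if out.last().map_or(false, |l| !l.is_empty()) {
                out.push(String::new());
            }
        } else if let Some(item) = strip_bullet(line) {
            // Bullet glyphs become Markdown list items.
            out.push(format!("- {}", item));
        } else if looks_like_heading(line) {
            // Short lines without lowercase letters are treated as headings.
            out.push(format!("## {}", line));
        } else {
            out.push(line.to_string());
        }
    }
    // Drop trailing blank lines.
    while out.last().map_or(false, |l| l.is_empty()) {
        out.pop();
    }
    out.join("\n")
}

/// Return the text after a leading bullet glyph, if the line has one.
fn strip_bullet(line: &str) -> Option<&str> {
    for marker in ['•', '‣', '▪'] {
        if let Some(rest) = line.strip_prefix(marker) {
            return Some(rest.trim_start());
        }
    }
    None
}

/// Heuristic: at least one letter, no lowercase letters, at most 60 chars.
fn looks_like_heading(line: &str) -> bool {
    let has_letter = line.chars().any(|c| c.is_alphabetic());
    let no_lowercase = !line.chars().any(|c| c.is_lowercase());
    has_letter && no_lowercase && line.chars().count() <= 60
}
```

Calling `text_to_markdown` on the extracted text of each page and joining the results would be one way to use it. Real PDFs would likely need layout information (font sizes, positions) rather than these text-only guesses, which is where a layout-aware tool could help.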

Related posts

  • OCR help required

    1 project | /r/Python | 18 Oct 2022
  • Amateur programmer here. Will Rust be used in backend for software in the future?

    2 projects | /r/rust | 27 May 2022
  • A Python Library for Document Layout Understanding

    1 project | news.ycombinator.com | 8 Apr 2021
  • Document Classification

    2 projects | /r/computervision | 8 Jun 2021
  • Supervision: Reusable Computer Vision

    5 projects | news.ycombinator.com | 24 Mar 2024