Base64.ai – Extract text, data, photos and more from all types of docs

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

  • I looked into OCR a while ago for some hundreds of thousands of pages of PDF. All hosted offerings would end up costing quite a bit.

    After looking at the options and running a few tests, I figured I'd use https://github.com/jbarlow83/OCRmyPDF

  • DeepSpeech

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

  • Should be able to use ffmpeg[0] to extract a single frame each second/keyframe (doubtful it's worth doing every single frame) and then pass it to tesseract.

    For speech-to-text, if the audio is English, try Mozilla's DeepSpeech? https://github.com/mozilla/DeepSpeech

    Might be fun to try.

    [0] https://stackoverflow.com/questions/27568254/how-to-extract-...

  • silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

  • For speech-to-text extraction you can try Silero [1].

    Free software (AGPL-3.0 License), fast, highly accurate and extremely simple to deploy (I have no affiliation with them).

    [1] https://github.com/snakers4/silero-models

  • invoice2data

    Extract structured data from PDF invoices

  • It's not really working. Tried 2 English PDF invoices. Normal format. One came back empty, the other only had the amount right.

    I'm assuming they only trained on some specific documents (passport of country X, etc) and all others don't work.

    If someone processes the same kind of document all the time, then my invoice2data project may work better, and it is open source. It's based on regex rather than machine learning: https://github.com/invoice-x/invoice2data
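
The regex-based approach described in the last comment can be illustrated with a minimal sketch. This is not invoice2data's API or template format; the `TEMPLATE` dictionary, its field names, its patterns, and the sample invoice text are all hypothetical, chosen only to show why per-layout regular expressions work well when the same document format recurs.

```python
import re

# Hypothetical template for one supplier's invoice layout. The field names
# and patterns are illustrative and are NOT invoice2data's template format.
TEMPLATE = {
    "invoice_number": r"Invoice\s+No\.?\s*:?\s*(\S+)",
    "date": r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})",
    "amount": r"Total\s*:?\s*\$?\s*([\d,]+\.\d{2})",
}


def extract_fields(text, template=TEMPLATE):
    """Return the first capture group of each pattern, or None if absent."""
    result = {}
    for name, pattern in template.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        result[name] = match.group(1) if match else None
    return result


# Made-up invoice text, standing in for text already extracted from a PDF.
sample = """ACME Corp
Invoice No: INV-1042
Date: 2021-03-15
Total: $1,234.50
"""

print(extract_fields(sample))
```

A field whose pattern does not match comes back as `None`, which makes it easy to spot a document that does not fit the template.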
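
The ffmpeg-plus-tesseract idea from the DeepSpeech comment above (sample one frame per second, then OCR each frame) might look roughly like the sketch below. It assumes the `ffmpeg` and `tesseract` command-line tools are installed and on the PATH; the function names, the `frame_%05d.png` naming pattern, and the output directory are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def ffmpeg_frame_command(video, out_dir, fps=1):
    """Build an ffmpeg command that writes `fps` frames per second as PNGs."""
    pattern = str(Path(out_dir) / "frame_%05d.png")
    # The fps video filter resamples the stream to the given frame rate.
    return ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}", pattern]


def tesseract_command(image):
    """Build a tesseract command that prints recognized text to stdout."""
    return ["tesseract", str(image), "stdout"]


def ocr_video(video, out_dir, fps=1):
    """Extract frames with ffmpeg, OCR each one, and return the texts in order."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_frame_command(video, out, fps), check=True)
    texts = []
    for frame in sorted(out.glob("frame_*.png")):
        done = subprocess.run(
            tesseract_command(frame), check=True, capture_output=True, text=True
        )
        texts.append(done.stdout.strip())
    return texts
```

Calling `ocr_video("talk.mp4", "frames")` (a hypothetical file name) would return one string per sampled frame. Consecutive frames often contain the same text, so deduplicating the results is likely worthwhile.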

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Ask HN: Offline, Embeddable Speech Recognition?

    4 projects | news.ycombinator.com | 19 Aug 2022
  • Show HN: State-of-the-Art German Speech Recognition in 284 lines of C++

    5 projects | news.ycombinator.com | 10 Aug 2022
  • Transcribe Speech to Text with Python for Free

    1 project | /r/programming | 30 Mar 2022
  • Voice to Text Options that respect privacy

    1 project | /r/privacy | 3 Jan 2022
  • Mozilla Common Voice Adds 16 New Languages and 4,600 New Hours of Speech

    12 projects | news.ycombinator.com | 5 Aug 2021