Recovering redacted information from pixelated videos

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • local_adaptive_binarization

    Local adaptive image binarization

  • Not off the shelf, but here are some tools. I have no experience with them.

    Wolf binarization - I think it makes the text clearer before OCR.

    https://github.com/chriswolfvision/local_adaptive_binarizati...

    OCRmyPDF runs OCR on the PDF using Tesseract.

    https://github.com/ocrmypdf/OCRmyPDF/

    Two other PDF tools:

    https://github.com/qpdf/qpdf

    https://github.com/pikepdf/pikepdf

  • OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

  • qpdf

    QPDF: A content-preserving PDF document transformer

  • pikepdf

    A Python library for reading and writing PDF, powered by QPDF

  • LaTeX-OCR

    pix2tex: Using a ViT to convert images of equations into LaTeX code.

  • im2markup

    Neural model for converting Image-to-Markup (by Yuntian Deng yuntiandeng.com)

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
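
The comment quoted above suggests binarizing the image before OCR to make the text clearer. As a rough illustration of what "local adaptive" binarization means, here is a minimal pure-Python sketch of Sauvola thresholding, a well-known method in the same family. It is not the code from the linked repository and not Wolf's method. The function name `sauvola_binarize` and the defaults for `window`, `k`, and `r` are assumptions chosen for the example.

```python
from math import sqrt


def sauvola_binarize(image, window=3, k=0.2, r=128.0):
    """Binarize a grayscale image given as a list of rows (values 0-255).

    Each pixel is compared against a threshold computed from the mean and
    standard deviation of its own neighbourhood, instead of one global value.
    Hypothetical helper for illustration only.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            values = [
                image[yy][xx]
                for yy in range(max(0, y - half), min(height, y + half + 1))
                for xx in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            std = sqrt(variance)
            # Sauvola's rule: T = m * (1 + k * (s / R - 1)).
            threshold = mean * (1.0 + k * (std / r - 1.0))
            # Pixels darker than the local threshold become black (0).
            row.append(0 if image[y][x] < threshold else 255)
        result.append(row)
    return result
```

Because the threshold follows the local mean, a dark stroke on a bright patch and a dark stroke on a dim patch can both be separated from their backgrounds, which a single global cutoff may miss. For real pages, a vectorized implementation from an image-processing library would be far faster than these nested loops.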
