Looking for advice on scanning a book into ebook format

This page summarizes the projects mentioned and recommended in the original post on /r/DataHoarder

  • scantailor

  • https://github.com/Tulon/scantailor/releases/tag/EXPERIMENTAL_2015_06_20 This experimental, older version of scantailor has a very good automated curvature correction feature. Another thing I like is that it has a CLI, so you can script it and run it across the entire book. As you can see from my attached image, it really cleans and flattens the image.

  • tesseract-ocr

    Tesseract Open Source OCR Engine (main repository)

  • I ran the first open source command-line tool for OCR that I could find, in this case https://github.com/tesseract-ocr/tesseract. The command was pretty straightforward: tesseract -l eng book.tif out_from_tiff Again, a simple shell script should be easy enough to write to apply it to all pages. The output did have a form feed character at the bottom. Obviously you can delete it manually, but that would take forever, so simply run..
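The per-page OCR step and the form feed cleanup described above could be scripted roughly as follows. This is only a sketch, not the original poster's script: it assumes the `tesseract` binary is on the PATH, that the cleaned pages are `.tif` files in one directory, and that tesseract writes its text to `<output base>.txt`. The directory names `pages` and `ocr_out`, the output file `book.txt`, and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def strip_form_feeds(text: str) -> str:
    """Remove the form feed characters (\\f) found at the end of OCR output."""
    return text.replace("\f", "")


def ocr_page(image: Path, out_dir: Path, lang: str = "eng") -> str:
    """Run tesseract on one page image and return the cleaned text."""
    out_base = out_dir / image.stem
    # Same invocation as in the post: tesseract -l eng book.tif out_from_tiff
    subprocess.run(
        ["tesseract", "-l", lang, str(image), str(out_base)],
        check=True,
    )
    # Assumption: tesseract appends ".txt" to the output base name.
    txt_path = Path(str(out_base) + ".txt")
    cleaned = strip_form_feeds(txt_path.read_text(encoding="utf-8"))
    txt_path.write_text(cleaned, encoding="utf-8")
    return cleaned


def ocr_book(page_dir: Path, out_dir: Path, lang: str = "eng") -> str:
    """OCR every .tif page in filename order and join the results."""
    out_dir.mkdir(parents=True, exist_ok=True)
    pages = sorted(page_dir.glob("*.tif"))
    return "\n".join(ocr_page(page, out_dir, lang) for page in pages)


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever the cleaned scans live.
    book_text = ocr_book(Path("pages"), Path("ocr_out"))
    Path("book.txt").write_text(book_text, encoding="utf-8")
```

Pages are processed in sorted filename order, so zero-padded names (for example `page_001.tif`) keep the book in sequence.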


Related posts

  • Highlighting Image Text

    1 project | dev.to | 30 Apr 2024
  • one of the Codia AI Design technologies: OCR Technology

    1 project | dev.to | 14 Feb 2024
  • OCR text to speech for disability

    1 project | /r/AskProgramming | 10 Dec 2023
  • How to Read Text From an Image with Python

    1 project | dev.to | 23 Oct 2023
  • I used Node.js to OCR "Meme Monday" threads

    1 project | dev.to | 5 Aug 2023