https://github.com/Tulon/scantailor/releases/tag/EXPERIMENTAL_2015_06_20 This experimental, older version of Scan Tailor has a very good automated curvature correction feature. Another thing I like is that it has a CLI, so you can script it and run it across the entire book. As you can see from my attached image, it really cleans up and flattens the page.
I ran the first open source command-line tool for OCR that I could find, in this case https://github.com/tesseract-ocr/tesseract. The command was pretty straightforward:
tesseract -l eng book.tif out_from_tiff
Again, a simple shell script should be easy enough to write to apply it to all pages. The output did have a form feed character at the bottom. Obviously you can delete it manually, but that would take forever, so simply run..
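Going back to the "apply it to all pages" step: here is a rough sketch of what such a shell script might look like. It is only an illustration under a few assumptions that are not from the post: the pages are individual .tif files in one directory, ocr_pages is a made-up function name, and TESSERACT_BIN is a hypothetical variable for pointing at a different tesseract binary. The tr call is just one possible way to drop the form feed, not necessarily the command referred to above.

```shell
#!/bin/sh
# Sketch only: OCR every .tif page in a directory with tesseract.
# Assumed, not from the original post: the *.tif naming, the function
# name ocr_pages, and the TESSERACT_BIN override variable.

ocr_pages() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1

    for page in "$in_dir"/*.tif; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$page" ] || continue

        base=$(basename "$page" .tif)

        # Same invocation as the single-page command; tesseract adds
        # the .txt extension to the output base name on its own.
        "${TESSERACT_BIN:-tesseract}" -l eng "$page" "$out_dir/$base" || return 1

        # One possible way to remove the form feed: delete every \f
        # character and write the result to a separate .clean.txt file.
        tr -d '\f' < "$out_dir/$base.txt" > "$out_dir/$base.clean.txt"
    done
}

# Example call (directory names are placeholders):
#   ocr_pages ./pages ./text
```

Writing the cleaned text to a separate .clean.txt file keeps the raw tesseract output around, in case you want to compare the two or re-run the cleanup differently.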