-
> For the scientific literature, we need a ChatGPT equivalent to reconstruct LaTeX source that can reproduce each page. (We really need a successor to LaTeX that isn't such an arcane language, and can author fixed and flowable text with equal ease.)
Check out Nougat: OCRing scientific papers with a deep net trained end to end. It was released by Meta a few days ago.
“PDF format leads to a loss of semantic information, particularly for mathematical expressions. We propose Nougat (Neural Optical Understanding for Academic Documents), a Visual Transformer model that performs an Optical Character Recognition (OCR) task for processing scientific documents into a markup language, and demonstrate the effectiveness of our model on a new dataset of scientific documents.”
-
Regarding your point about a successor to LaTeX: https://typst.app/ is turning out to be great.
-
Most signatures (https://en.wikipedia.org/wiki/Section_(bookbinding)) are glued, not just sewn. Some use cases prefer or require unbinding the pages rather than cutting them, because the printing goes all the way into the gutter and cutting the spine can lose some data. Unbinding is highly inefficient, though: you have to heat the spine carefully and then remove the glue residue. A tiny bit of leftover glue can smear your autofeed scanner, if not jam it completely and tear the page. For a unique item, a non-destructive scanning method makes sense, but for anything else, a carefully cut spine (or better yet, a bookbinding plow https://duckduckgo.com/?t=palemoon&q=bookbinding+plow&iax=im... ) leaves a perfect cut, and the loose pages can be kept in a Ziploc bag for future reference.