Show HN: How do you OCR on a Mac using the CLI or just Python for free

doctr

12 3,183 9.0 Python

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Tesseract is widely known to be "meh" at this point.
If you look at RAG frameworks as one example, they typically support a variety of OCR implementations. Tesseract is almost always supported, but it's rarely ideal; projects like Unstructured[0] and DocTR[1] are preferred. By leveraging more-or-less SOTA vision models[2][3], they embarrass Tesseract.
I haven't compared them to the Apple Vision framework, but they're absolutely better than Tesseract and potentially even better than Apple Vision.
[0] - https://github.com/Unstructured-IO/unstructured-inference
[1] - https://github.com/mindee/doctr
[2] - https://github.com/mindee/doctr#models-architectures
[3] - https://github.com/Unstructured-IO/unstructured-inference#mo...
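
For readers who want to try docTR on a single image, here is a minimal sketch, not a definitive recipe. It assumes the `python-doctr` package and one of its deep learning backends are installed, and that the exported result is a nested dict of pages, blocks, lines and words. The helper names `text_from_export` and `ocr_with_doctr`, and the `min_confidence` parameter, are hypothetical and exist only for this illustration.

```python
def text_from_export(export: dict, min_confidence: float = 0.0) -> str:
    """Join recognized words into lines of text.

    Assumes a docTR-style export: pages -> blocks -> lines -> words,
    where each word carries a "value" and a "confidence".
    """
    lines_out = []
    for page in export.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                words = [
                    word["value"]
                    for word in line.get("words", [])
                    if word.get("confidence", 1.0) >= min_confidence
                ]
                if words:
                    lines_out.append(" ".join(words))
    return "\n".join(lines_out)


def ocr_with_doctr(image_path: str, min_confidence: float = 0.0) -> str:
    """Run docTR's pretrained predictor on one image file."""
    # Imported lazily so text_from_export works without docTR installed.
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    model = ocr_predictor(pretrained=True)  # fetches weights on first use
    doc = DocumentFile.from_images(image_path)
    result = model(doc)
    return text_from_export(result.export(), min_confidence)
```

Calling `ocr_with_doctr("tmp/test.png")` would return the recognized text as one string; raising `min_confidence` drops words the model is unsure about.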

ocrmac

1 140 6.8 Jupyter Notebook

A Python wrapper to extract text from images on a Mac system. Uses the Vision framework from Apple.

Nice post, OP! I was super impressed with Apple's Vision framework. I used it on a personal project that involved OCRing tens of thousands of spreadsheet screenshots and ingesting them into a Postgres database.
I used a combination of RhetTbull's vision.py (for the actual implementation) [1] + ocrmac (for experimentation) [2] and was pleasantly surprised by the performance on my i7 6700k hackintosh.
I wouldn't call myself a programmer, but I can generally troubleshoot anything given enough time, and this did cost time.
[1]: https://gist.github.com/RhetTbull/1c34fc07c95733642cffcd1ac5...
[2]: https://github.com/straussmaximilian/ocrmac
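
As a rough sketch of the experimentation step with ocrmac (macOS only, assuming the `ocrmac` package is installed): `OCR(...).recognize()` is taken here to return a list of (text, confidence, bounding box) tuples. The helper names `confident_text` and `ocr_image` and the 0.5 threshold are hypothetical choices for this example, not part of the commenter's project.

```python
def confident_text(annotations, min_confidence=0.5):
    """Keep only the text of annotations at or above a confidence threshold.

    Each annotation is assumed to be a (text, confidence, bounding_box) tuple.
    """
    return [
        text
        for text, confidence, _bbox in annotations
        if confidence >= min_confidence
    ]


def ocr_image(image_path):
    """Run Apple's Vision OCR on one image through ocrmac."""
    # Imported lazily: ocrmac only installs and runs on macOS.
    from ocrmac import ocrmac

    return ocrmac.OCR(image_path).recognize()
```

On a Mac, `confident_text(ocr_image("tmp/test.png"))` would give the recognized strings, ready to be written to a database or a CSV file.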

unstructured-inference

1 119 8.7 Python

aichat

18 3,046 9.7 Rust

All-in-one AI CLI tool that integrates 20+ AI platforms, including OpenAI, Azure-OpenAI, Gemini, Claude, Mistral, Cohere, VertexAI, Bedrock, Ollama, Ernie, Qianwen, Deepseek...

Use a vision-capable LLM (gpt-4-vision or LLaVA) with aichat:
`aichat -f tmp/test.png -- output only text in the image`
https://github.com/sigoden/aichat
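
The same command can be driven from Python. This is only a sketch: it assumes `aichat` is on the PATH and already configured with a vision-capable model, and the helper names `build_aichat_command` and `ocr_with_aichat` are hypothetical. The argument list mirrors the shell command above word for word.

```python
import subprocess


def build_aichat_command(image_path, prompt="output only text in the image"):
    """Build the argv list equivalent to the shell command shown above."""
    # Everything after "--" is the prompt, passed as separate words
    # exactly as the shell would pass them.
    return ["aichat", "-f", image_path, "--", *prompt.split()]


def ocr_with_aichat(image_path):
    """Run aichat on one image and return whatever text it prints."""
    completed = subprocess.run(
        build_aichat_command(image_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if aichat exits non-zero
    )
    return completed.stdout.strip()
```

Because the output comes from a language model, it is worth spot-checking results against the source image before relying on them.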

Camelot

10 2,712 6.6 Python

A Python library to extract tabular data from PDFs

I have repeatedly had good success extracting tables from PDFs using Camelot (Python, https://github.com/camelot-dev/camelot).
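
A minimal sketch of table extraction with Camelot, assuming the `camelot-py` package and its dependencies are installed. It relies on `camelot.read_pdf` and on each table's `parsing_report` and `df` attributes. The helper names `select_tables` and `extract_tables` and the accuracy threshold of 90 are hypothetical choices for this example.

```python
def select_tables(reports, min_accuracy=90.0):
    """Return the indices of tables whose reported accuracy is high enough.

    Each report is assumed to be a dict with an "accuracy" key,
    as found in a Camelot table's parsing_report.
    """
    return [
        index
        for index, report in enumerate(reports)
        if report.get("accuracy", 0.0) >= min_accuracy
    ]


def extract_tables(pdf_path, pages="1", min_accuracy=90.0):
    """Read tables from a PDF and keep the well-parsed ones as DataFrames."""
    # Imported lazily so select_tables works without Camelot installed.
    import camelot

    tables = camelot.read_pdf(pdf_path, pages=pages)
    keep = select_tables([table.parsing_report for table in tables], min_accuracy)
    return [tables[index].df for index in keep]
```

Note that Camelot works on text-based PDFs, so scanned documents would need an OCR pass first.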

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

LlamaCloud and LlamaParse
9 projects | news.ycombinator.com | 20 Feb 2024

Pdfsandwich
6 projects | news.ycombinator.com | 6 Nov 2021

exporting handwritten dataset as text, export it and use it as a csv
3 projects | /r/RemarkableTablet | 16 Sep 2021

CSS Written in Pure Go
2 projects | news.ycombinator.com | 1 Jun 2024

FLaNK-AIM: 20 May 2024 Weekly
28 projects | dev.to | 20 May 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Specific Formats Processing OCR AI PDF Deep Learning
Post date: 2 Jan 2024
