gImageReader
percollate
Metric | gImageReader | percollate
---|---|---
Mentions | 15 | 14
Stars | 1,519 | 4,108
Growth | - | -
Activity | 7.8 | 5.9
Last commit | 28 days ago | 3 months ago
Language | C++ | JavaScript
License | GNU General Public License v3.0 only | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gImageReader
-
Making an archive out of my grandfather's writings. What OCR scanning and doc mgt system to use?
Here is a Tesseract-based program that turns a scan into a text-searchable PDF. It takes a bit of time and can be a bit tedious, but it does the job! https://github.com/manisandro/gImageReader/releases It does not work well on cursive handwriting, of course. It's a less code-heavy solution. Good luck!
- Is there free software for windows that can read scanned handwriting and turn it into text?
- Where can I download the Sakhr program? I've searched for it a lot and can't find it. And if it isn't available, does anyone know a good alternative that does Arabic OCR?
-
Writer - Tips to remove breaks and hyphenations from PDF to DOC conversion?
I'm working with old newspaper PDFs to convert them into DOC format. I'm having a great time with gImageReader by highlighting columns and converting them to plain text. Then I take that plain text into LibreOffice Writer (7.0.4.2) to clean up and save. If this were a book as opposed to a newspaper with ads and columns, it would have been a lot easier to convert and format.
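The cleanup step described above (removing line breaks and end-of-line hyphenation from OCR output) can also be scripted before the text reaches Writer. The following is a minimal sketch in Python, not the poster's method; the function name `unwrap_ocr_text` is hypothetical, and it assumes paragraphs in the plain-text export are separated by blank lines.

```python
import re


def unwrap_ocr_text(text: str) -> str:
    """Join hard-wrapped OCR lines into paragraphs (hypothetical helper)."""
    # Normalize Windows line endings so the patterns below see only "\n".
    text = text.replace("\r\n", "\n")
    # Rejoin words split by a hyphen at the end of a line:
    # "conver-\nsion" becomes "conversion".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Assumption: blank lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text)
    # Within each paragraph, replace the remaining line breaks with spaces.
    cleaned = [
        " ".join(line.strip() for line in p.splitlines() if line.strip())
        for p in paragraphs
    ]
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "The news-\npaper was\nprinted.\n\nSecond para-\ngraph."
    print(unwrap_ocr_text(sample))
```

One limitation of this approach: a genuinely hyphenated compound that happens to break at the end of a line (for example "well-" / "known") is also merged, so the result still needs a manual pass.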
-
Best OCR software for extracting pdf to txt - Paid or Free version.
It would help to know a bit more about your use case. If you're looking to just extract the text (i.e., take all the textual content of your PDF and drop it into a separate text document), there are solutions like ABBYY FineReader and gImageReader. If you're looking to make PDFs searchable (keeping the scanned pages, but adding a text layer underneath so you can search and copy from them), there's NAPS2 (which has an additional command line tool for automation) and OCRmyPDF.
-
Help plz! Tool to enhance pdf text quality?
Open-source OCR: for desktop users I like "gImageReader". URL: https://github.com/manisandro/gImageReader (Technically it is a GUI for Tesseract.)
-
Good Open Source OCR software
gImageReader is the Linux standard that I'm aware of. It's a GUI for Tesseract, but IIRC you can use other models if you have them.
-
What Are The Best Linux Apps?
gImageReader as a simple OCR application
-
OCR Arabic screenshot clipboard captures for Mac
https://github.com/manisandro/gImageReader ^^ seems like it has installers for different OSes
- Is there a good/accurate OCR/Text to Image program available?
percollate
-
The Case Against AI Everything, Everywhere, All at Once
You can still choose automation. The easier route for me is to use wallabag to save the article. Then on my reMarkable tablet I can grab a very readable document with https://github.com/koreader/koreader.
The other option is to use https://github.com/danburzo/percollate to convert a webpage to a nice document directly. I use both tools depending on my needs.
-
Share my down(load) function!
This function is just a simple combination with yt-dlp and percollate.
- Selfhosted service to screenshot websites - but I'm not finding the options I need
-
Reverse Engineering or Recreating the Chrome Extension?
If someone hasn't already done this and I can't figure out how they are converting HTML, I have also considered using percollate to convert, then sending to the reMarkable via rmapi.
-
ArchiveBox Alternative
The CLI tool percollate offers a different approach, but is also very good: https://github.com/danburzo/percollate
- Reading web articles on the reMarkable
-
Is there a command line program to convert web pages into readable markdown/htm/pdf format? preferably markdown
For PDF there is the well-known wkhtmltopdf, but let me say that I love the not-so-well-known percollate.
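For anyone who wants to script the percollate route, a minimal Python sketch of wrapping the CLI might look like the following. The `pdf`, `epub`, and `html` subcommands and the `--output` option belong to percollate's command line; the function names `percollate_command` and `save_page` are hypothetical, and the sketch assumes percollate is already installed and on the PATH.

```python
import shutil
import subprocess

# Output formats this sketch supports, matching percollate subcommands.
FORMATS = {"pdf", "epub", "html"}


def percollate_command(url: str, output: str, fmt: str = "pdf") -> list[str]:
    """Build the percollate argument list (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return ["percollate", fmt, "--output", output, url]


def save_page(url: str, output: str, fmt: str = "pdf") -> None:
    """Run percollate on one URL; assumes the CLI is installed."""
    if shutil.which("percollate") is None:
        raise RuntimeError("percollate was not found on PATH")
    # check=True raises CalledProcessError if percollate exits with an error.
    subprocess.run(percollate_command(url, output, fmt), check=True)


if __name__ == "__main__":
    # Only prints the command; call save_page() to actually convert.
    print(" ".join(percollate_command("https://example.com", "page.pdf")))
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is downloaded.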
- CLI to turn web pages into beautiful, readable PDF, ePub, or HTML docs
-
Show HN: Lurnby, a tool for better learning, is now open source
I'm working on a similar project, and this is how I am planning to pull content from the web: using percollate[1] to get the HTML content. I haven't written any implementation for this in Python yet.
If you don't mind me asking, how were you going to implement spaced repetition? As far as I know, the Incremental Reading algorithm has never been published.
[1]: https://github.com/danburzo/percollate
- What Are The Best Linux Apps?
What are some alternatives?
PaddleOCR - Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
rdrview - Firefox Reader View as a command line tool
tesseract - Tesseract Open Source OCR Engine (main repository)
koodo-reader - A modern ebook manager and reader with sync and backup capacities for Windows, macOS, Linux and Web
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
SingleFile - Web Extension for saving a faithful copy of a complete web page in a single HTML file
docker-teedy - Multi-architecture Dockerfile for Teedy (formerly Sismics Docs)
zimit - Make a ZIM file from any Web site and surf offline!
webapp-manager
monolith-of-web - A chrome extension to make a single static HTML file of the web page using a WebAssembly port of monolith CLI
warpinator - Share files across the LAN
BasicCrawler - Basic web crawler that automates website exploration and producing web resource trees.