scantailor-advanced
paperless-ngx
Metric | scantailor-advanced | paperless-ngx
---|---|---
Mentions | 21 | 212
Stars | 1,107 | 16,882
Growth | - | 3.1%
Activity | 0.0 | 9.9
Last commit | 8 months ago | 1 day ago
Language | C++ | Python
License | GNU General Public License v3.0 only | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
scantailor-advanced
- Z-Library to Let Users Share Physical Books
There's also https://scantailor.org/ (and a maintained fork at https://github.com/4lex4/scantailor-advanced ) which semi-automates unwarping and other corrective tasks in scanned books.
- Pro tip: Scan and file all your documents. Now.
- Looking for freeware to scan multiple photos and autocrop. Straightening would be a plus
ScanTailor Advanced (downloads are in the sidebar, labeled as "Releases") is a little more complex. It takes a folder of images and performs a series of steps to produce good book scans, though it might be useful for your purpose as well. The first step after importing images is to split them up, and ScanTailor does this by looking for straight lines that might indicate a gap between pages. Now, I don't know if this will handle any more than two pictures per image, but it will at least handle doubles. ScanTailor will also attempt to automatically deskew pictures, and then there are a few steps that you'll need to take to make your pictures come out nicely.
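The page-splitting idea described above can be illustrated with a minimal sketch. This is not ScanTailor's actual algorithm: it assumes a grayscale image given as a list of rows (0 = black, 255 = white) and assumes the gap between pages shows up as the darkest vertical line near the middle. The function names `find_split_column` and `split_pages` are hypothetical.

```python
def find_split_column(image, center_band=0.5):
    """Return the index of the darkest column inside the central band.

    image: list of rows, each a list of grayscale values (0-255).
    center_band: fraction of the width, centered, that is searched.
    Assumption: the gutter between pages is darker than the pages.
    """
    width = len(image[0])
    margin = int(width * (1 - center_band) / 2)
    candidates = range(margin, width - margin)
    # Sum the brightness of each candidate column over all rows.
    column_sums = {x: sum(row[x] for row in image) for x in candidates}
    # The column with the lowest total brightness is the darkest one.
    return min(column_sums, key=column_sums.get)


def split_pages(image, column):
    """Cut the image into a left and a right page at the given column."""
    left = [row[:column] for row in image]
    right = [row[column:] for row in image]
    return left, right
```

A real tool would also have to cope with skewed gutters and more than two items per scan, which this sketch does not attempt.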
- What's the best software to split multiple scanned-at-once photos apart?
- I’m looking for OCR software that scans text
My preferred method is to take pictures of all the pages of the book (Open Camera has a nice option to take a new picture every n seconds), optionally touch them up with ScanTailor (automated), and then turn all the images to a PDF using NAPS2 (which will OCR the text as it goes in).
- Tutorial on book digitization
Next is the cleanup. Scan Tailor is the best game in town for this, but it's a dead project. Instead, there are two forks that have picked up where the original developers left off. Scan Tailor Advanced is my current fork of choice, though Scan Tailor Universal tries to add new usability features. For whatever reason, only Advanced makes full use of my CPU, so it's several times faster than Universal for the time being.
- Program for book digitisation/scanning?
- Request Help: How to batch split two pages (mis-scanned onto one page) into two PDF pages, one PDF page for each page image? Any suggestions on other software products to assist?
No problem. https://github.com/4lex4/scantailor-advanced/releases
- https://np.reddit.com/r/opensource/comments/ou4y5h/is_there_an_open_source_program_to_reduce_the/h72nl4g/
If you are interested in more ways to treat a scan, ScanTailor is definitely an option. I am using a fork called ScanTailor Advanced which is available under GPL-3.0 on GitHub. With this tool you can also crop your images and apply options like threshold or posterization to improve readability while further reducing the file size.
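To make the two options mentioned above concrete, here is a minimal sketch of thresholding and posterization on a grayscale image stored as a list of rows (values 0 to 255). It is an illustration under those assumptions, not ScanTailor Advanced's implementation; the function names and the default `cutoff` and `levels` values are hypothetical.

```python
def threshold(pixels, cutoff=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if value >= cutoff else 0 for value in row] for row in pixels]


def posterize(pixels, levels=4):
    """Reduce the image to `levels` evenly spaced gray values."""
    step = 256 // levels
    # Clamp the bucket index so the brightest values stay in the top bucket.
    return [[min(value // step, levels - 1) * step for value in row]
            for row in pixels]
```

Fewer distinct pixel values generally compress better, which is why both operations tend to shrink the resulting files.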
- Tesseract OCR
I use a £15 arm with a vice grip for my phone from Amazon, copy the files to my laptop and then run a bash for-loop of the tesseract CLI over the resultant files.
I use https://github.com/4lex4/scantailor-advanced to deskew the images and generate the PDF.
It isn't perfect but my purposes are more around research than publication, so, YMMV!
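The loop over the tesseract CLI mentioned above might look roughly like the following, written here in Python rather than bash. It is a sketch under assumptions: the images are taken to be PNG files in a single folder, the language is assumed to be English, and the helper names `build_tesseract_commands` and `run_all` are hypothetical. It relies only on the standard `tesseract imagename outputbase -l lang` invocation.

```python
import subprocess
from pathlib import Path


def build_tesseract_commands(image_dir, pattern="*.png", lang="eng"):
    """Build one tesseract command per image, in sorted filename order."""
    commands = []
    for image in sorted(Path(image_dir).glob(pattern)):
        # tesseract appends ".txt" to the output base name itself.
        output_base = image.with_suffix("")
        commands.append(["tesseract", str(image), str(output_base), "-l", lang])
    return commands


def run_all(commands):
    """Run each command in turn and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_all(build_tesseract_commands("scans"))` would write a `.txt` file next to each image in a hypothetical `scans` folder, provided tesseract is installed and on the PATH.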
paperless-ngx
- I accidentally built a meme search engine
I steered a friend towards Paperless (and away from an LLM solution) as a way of searching/accessing GBs of architectural PDFs recently - so far, it’s apparently working well for them.
https://github.com/paperless-ngx/paperless-ngx
- 🔍Underrated Open Source Projects You Should Know About 🧠
Paperless-ngx is a document management system that transforms your physical documents into a searchable online archive so you can find them more easily. With features such as tags, full-text search, and a multi-user permissions system, it is a dream for those who like to keep an organized folder of files and documents.
- Paperless-Ngx
- Home Lab Guide
Since last year I’ve been configuring and maintaining my homelab setup and it is just amazing.
I’ve learned so much about containers, virtual machines and networking. Some of the self-hosted applications, like paperless-ngx [1] and immich [2], are far superior in features to the proprietary cloud solutions.
With VPN services like tailscale [3], I can now access my homelab from anywhere in the world.
The only thing missing is to set up a low-powered machine like a NUC or any mini PC so I can offload the services I need 24/7 and save electricity costs.
If you can maintain it and have enough energy on weekends to perform routine maintenance and upgrades, I would 100% recommend setting up your own homelab.
[1] https://docs.paperless-ngx.com/
- Ask HN: What Underrated Open Source Project Deserves More Recognition?
This has been posted a few times already, but I cannot tell you how life changing Paperless NGX is for organizing PDFs. As someone who wrangles all of the insurance and bills for my house, this open source software is so damn good.
https://docs.paperless-ngx.com/
I maintain a Bash script to quickly set it up locally on Linux with Podman. Give it a spin if you want to kick the tires.
https://github.com/jdoss/ppngx
- Daily Price Tracking for Trader Joes
- Taking (Back?) My Internet Privacy and Presence
Personally, I use https://github.com/joeyates/imap-backup to archive all my emails and then only keep them on the remote server for as long as I need to (basically until I read them and respond or download an attachment into https://docs.paperless-ngx.com )
- Paperless-NGX: transform your physical documents into a searchable archive
- Paperless-ngx: open-source document management system
What are some alternatives?
scantailor-universal - ScanTailor Universal - a fork based on Enhanced+Featured+Master versions of ST
Papermerge - Open Source Document Management System for Digital Archives (Scanned Documents)
EasyOCR - Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Paperless-ng - A supercharged version of paperless: scan, index and archive all your physical documents
bookscan - Documentation and scripts for book scanning using free software tools
Docspell - Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources with minimal effort.
PaddleOCR - Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Mayan EDMS - Free Open Source Document Management System (mirror, no pull request or issues)
Tesseract.js - Pure JavaScript OCR for more than 100 Languages 📖🎉🖥
Nextcloud - ☁️ Nextcloud server, a safe home for all your data
pi-scan - Pi Scan is a simple, robust capture appliance for book scanners. It runs on a Raspberry Pi 2.
Nginx Proxy Manager - Docker container for managing Nginx proxy hosts with a simple, powerful interface