 | llama2_aided_tesseract | tools |
---|---|---|
Mentions | 4 | 8 |
Stars | 204 | 1,371 |
Growth | - | 0.1% |
Activity | 7.2 | 9.2 |
Last commit | 10 months ago | 9 days ago |
Language | Python | Python |
License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
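The exact formula behind the activity number is not given here. As a rough illustration of the two stated properties (recent commits count for more, and the final number reflects rank among tracked projects), here is a minimal sketch. The exponential decay, the 30-day half-life, and both function names are assumptions made for this example, not the site's actual method.

```python
from datetime import datetime, timedelta, timezone


def weighted_commit_score(commit_dates, now, half_life_days=30.0):
    """Sum per-commit weights, where a commit's weight halves every
    `half_life_days` (an assumed decay rule, chosen for illustration)."""
    score = 0.0
    for committed_at in commit_dates:
        age_days = max((now - committed_at).total_seconds() / 86400.0, 0.0)
        score += 0.5 ** (age_days / half_life_days)
    return score


def activity_from_rank(raw_scores, project):
    """Map a raw score to a 0-10 scale using the share of tracked
    projects that score strictly lower than `project`."""
    values = list(raw_scores.values())
    below = sum(1 for value in values if value < raw_scores[project])
    return round(10.0 * below / len(values), 1)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    recent = weighted_commit_score([now - timedelta(days=9)], now)
    stale = weighted_commit_score([now - timedelta(days=300)], now)
    print(recent > stale)  # a 9-day-old commit outweighs a 300-day-old one
```

Under this rank-based mapping, a project that outscores 90% of the tracked projects lands at 9.0, which matches the "top 10%" reading given above.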
llama2_aided_tesseract
- Standard Ebooks
I made a tool like that, and I bet with a more powerful LLM like GPT4, and perhaps a better baseline OCR tool (like GPT4 vision), it could work really well for this sort of thing:
https://github.com/Dicklesworthstone/llama2_aided_tesseract
- Use Llama2 to Improve the Accuracy of Tesseract OCR
- FLaNK Stack Weekly for 07August2023
- Show HN: Using LLama2 to Correct OCR Errors
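The idea in the mentions above, running OCR first and then asking an LLM to repair the errors, can be sketched roughly as follows. This is not the repository's code. The chunking rule, the prompt wording, and the names `split_into_chunks` and `correct_ocr_text` are assumptions for illustration, and the model is passed in as a plain callable so that any backend (Llama 2, GPT-4, or a test stub) can be plugged in.

```python
from typing import Callable, List

# Hypothetical prompt; a real one would need tuning for the chosen model.
PROMPT = "Fix OCR errors in the text below. Change nothing else.\n\n{chunk}"


def split_into_chunks(text: str, max_chars: int = 500) -> List[str]:
    """Group paragraphs into chunks of roughly `max_chars` characters.
    A single paragraph longer than the limit is kept whole."""
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def correct_ocr_text(raw_text: str, llm: Callable[[str], str],
                     max_chars: int = 500) -> str:
    """Send each chunk of raw OCR output to `llm` and rejoin the results."""
    corrected = [llm(PROMPT.format(chunk=chunk))
                 for chunk in split_into_chunks(raw_text, max_chars)]
    return "\n\n".join(corrected)
```

The raw text could come from Tesseract, for example via `pytesseract.image_to_string(image)`. Splitting on paragraph boundaries keeps each request small enough for a limited context window, at the cost of the model not seeing neighbouring paragraphs.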
tools
- Ask HN: What are the best eBook authoring tools today?
This violates the "One Tool" constraint that OP requested, but the Standard Ebooks tool chain is available on Github for anyone interested: https://github.com/standardebooks/tools
- Standard Ebooks
The code is GPL-3 and the templates are CC0: https://github.com/standardebooks/tools/blob/master/LICENSE....
Feel free to ask on the mailing list if you have any questions, more likely to be picked up there than in a random HN thread :)
- Hobbes: “Leviathan” in Modern English. Introduction
- Fish 3.4.0
- Today I learned ePub is just HTML/CSS
I'll give a shoutout to some other excellent software.
The first is the "Standard Ebooks"[1] toolset, which is a suite of Python scripts to create, process, and build ebooks in all common formats. The results on the Standard Ebooks site speak for themselves. They're impeccable in every way, and far better than many big name, commercially produced efforts.
GitHub: https://github.com/standardebooks/tools
- 17-volume Arabian Nights available in its entirety at Project Gutenberg
This question comes up a lot. The source to our production pipeline is GPLed and freely available,[1] but the biggest part of why we produce good work is that we have a high quality manual of style.[2] Unfortunately, that second part is very specific to English, and that’s the difficult part to replicate for other languages.
[1] https://github.com/standardebooks/tools/
[2] https://standardebooks.org/manual/
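The "ePub is just HTML/CSS" remark in the mentions above is easy to check, since an EPUB file is a ZIP archive whose content documents are XHTML and CSS. The sketch below uses only Python's standard `zipfile` module and does not use the Standard Ebooks tools. The archive it builds is a toy: the file names under `OEBPS/` are made up, and it omits the container and package files a valid EPUB requires.

```python
import io
import zipfile


def build_toy_epub() -> bytes:
    """Build an in-memory, EPUB-like ZIP archive (incomplete on purpose)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # In a real EPUB the mimetype entry comes first and is stored
        # without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr(
            "OEBPS/chapter-1.xhtml",
            '<html xmlns="http://www.w3.org/1999/xhtml">'
            "<body><p>Hello.</p></body></html>",
        )
        archive.writestr("OEBPS/style.css", "p { text-indent: 1em; }")
    return buffer.getvalue()


def list_web_files(epub_bytes: bytes) -> list:
    """Return the names of the HTML/XHTML and CSS entries in the archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return [name for name in archive.namelist()
                if name.endswith((".xhtml", ".html", ".css"))]


if __name__ == "__main__":
    for name in list_web_files(build_toy_epub()):
        print(name)
```

To inspect a real book, the same `list_web_files` logic can be pointed at the bytes of any `.epub` file read from disk.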
What are some alternatives?
harlequin - The SQL IDE for Your Terminal.
epub3-samples - EPUB 3 Sample Documents
gorilla-cli - LLMs for your CLI
syncabook - 📖🎧 A tool for creating ebooks with synchronized text and audio (EPUB3 with Media Overlays)
OpenBuddy - Open Multilingual Chatbot for Everyone
leech - Turn a story on certain websites into an ebook for convenient reading
CallCMLModel - An example on calling models deployed in CML
Sigil - Sigil is a multi-platform EPUB ebook editor
EverythingApacheNiFi - EverythingApacheNiFi
ebook-diffuser - An end to end, customizable, ebook automation tool
fuzzy-matcher - A Java library to determine probability of objects being similar.
PyQtGraph - Fast data visualization and GUI tools for scientific / engineering applications