paper-bidsheets
System to create paper auction bidsheets with Google Sheets and scan them using Ollama (by philips)
wordninja
Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. (by keredson)
Metric | paper-bidsheets | wordninja
---|---|---
Mentions | 1 | 2
Stars | 7 | 841
Growth | - | 1.9%
Activity | 5.0 | 0.0
Last commit | 4 months ago | about 2 years ago
Language | Go | Python
License | - | MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
paper-bidsheets
Posts with mentions or reviews of paper-bidsheets.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-11-15.
- Llama-OCR: An Open-Source Llama 3.2 Based OCR Tool
I have recently used llama3.2-vision to handle some paper bidsheets for a charity auction and it is fairly accurate with some terrible handwriting. I hope to use it for my event next year.
I do find it rather annoying not being able to get it to consistently output a CSV though. ChatGPT and Gemini seem better at doing that but I haven’t tried to automate it.
The scale of my problem is about 100 pages of bidsheets and so some manual cleaning is ok. It is certainly better than burning volunteers' time.
https://github.com/philips/paper-bidsheets
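The inconsistent CSV output described in this post is often handled with a small normalization step rather than stricter prompting. The sketch below is one possible approach and is not taken from the paper-bidsheets repository: the function names `extract_rows` and `to_csv`, the three-column layout, and the sample text are all hypothetical. It assumes the model replies with either comma-separated lines or Markdown table rows, possibly wrapped in chatter or code fences.

```python
import csv
import io
import re

FENCE = "`" * 3  # Markdown code-fence marker that models often wrap output in


def extract_rows(model_output: str, expected_columns: int) -> list[list[str]]:
    """Collect rows from loosely formatted model text (hypothetical helper).

    Accepts comma-separated lines and Markdown table rows. Drops blank
    lines, code fences, table separator rows, and any line whose cell
    count differs from expected_columns (for example, chatty prose).
    """
    rows = []
    for raw in model_output.splitlines():
        line = raw.strip()
        if not line or line.startswith(FENCE):
            continue
        if line.startswith("|"):
            # Markdown table row: split on pipes after trimming the outer ones.
            cells = [c.strip() for c in line.strip("|").split("|")]
            if all(re.fullmatch(r":?-+:?", c) for c in cells):
                continue  # separator row such as |---|---|---|
        else:
            # Let the csv module handle quoting inside a comma-separated line.
            cells = [c.strip() for c in next(csv.reader([line]))]
        if len(cells) == expected_columns:
            rows.append(cells)
    return rows


def to_csv(rows: list[list[str]]) -> str:
    """Serialize rows as CSV text with consistent quoting and newlines."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Made-up model reply mixing prose, CSV lines, and a Markdown table row.
sample = "Here is the data:\nitem,bidder,amount\n12,Alice,40\n| 13 | Bob | 55 |\n|---|---|---|\n"
print(to_csv(extract_rows(sample, expected_columns=3)))
```

Filtering on column count is a blunt check: it silently drops rows where the model merged or split a cell, so rejected lines are worth logging for the manual cleanup pass the post mentions.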
wordninja
Posts with mentions or reviews of wordninja.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-11-15.
- Llama-OCR: An Open-Source Llama 3.2 Based OCR Tool
WordNinja is pretty good as a post-processing step on wrongly split/concatenated words:
[0]: https://github.com/keredson/wordninja
- Probabilistically split concatenated words using NLP based on English Wikipedia
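To make the idea of probabilistic splitting by unigram frequency concrete, here is a minimal sketch of the general technique: dynamic programming over split points, where each candidate word costs its negative log probability. It is an illustration only, not wordninja's code. The vocabulary, the counts, the `UNKNOWN_CHAR_COST` penalty, and the function name `split_words` are all assumptions made up for this example; a real splitter would load a large frequency list, and the library itself is normally used through its own `wordninja.split` function.

```python
import math

# Hypothetical toy unigram counts standing in for a real frequency list.
COUNTS = {
    "paper": 50,
    "bid": 40,
    "bids": 20,
    "sheet": 35,
    "sheets": 30,
    "auction": 25,
}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = max(len(w) for w in COUNTS)
UNKNOWN_CHAR_COST = 20.0  # assumed penalty for a character outside the vocabulary


def word_cost(word: str) -> float:
    """Negative log probability of a known word; penalty for a stray character."""
    if word in COUNTS:
        return -math.log(COUNTS[word] / TOTAL)
    return UNKNOWN_CHAR_COST if len(word) == 1 else math.inf


def split_words(text: str) -> list[str]:
    """Return the lowest-cost segmentation of text under the toy unigram model."""
    text = text.lower()
    n = len(text)
    best = [0.0] + [math.inf] * n  # best[i]: minimal cost of splitting text[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last word in that split
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            cost = best[start] + word_cost(text[start:end])
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Walk the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


print(split_words("paperbidsheets"))  # ['paper', 'bid', 'sheets']
```

Because a single unknown character always has a finite cost, every position stays reachable and the function returns a segmentation even for text outside the vocabulary, at the price of splitting it into single characters.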
What are some alternatives?
When comparing paper-bidsheets and wordninja you can also consider the following projects:
llama-ocr - Document to Markdown OCR library with Llama 3.2 vision
OCRmyPDF - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
nougat - Implementation of Nougat Neural Optical Understanding for Academic Documents
zerox - OCR & Document Extraction using vision models