open-parse
Improved file parsing for LLMs (by Filimoa)
s3-ocr
Tools for running OCR against files stored in S3 (by simonw)
Metric | open-parse | s3-ocr
---|---|---
Mentions | 3 | 1
Stars | 1,782 | 108
Growth | - | -
Activity | 9.2 | -
Last commit | 9 days ago | over 1 year ago
Language | Python | Python
License | MIT License | Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
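The site does not publish the formula behind the activity number, so the following is only a minimal sketch of one way such a score could work. It assumes an exponential decay with a 30-day half-life for weighting commits and a simple percentile rank for the 0 to 10 scale; the function names and both parameters are hypothetical.

```python
def activity_weight(age_days, half_life_days=30.0):
    """Weight of one commit: 1.0 for a commit made today, halving every
    half_life_days. The half-life is an assumed parameter, not a published one."""
    return 0.5 ** (age_days / half_life_days)


def raw_activity(commit_ages_days, half_life_days=30.0):
    """Sum of recency weights over all commits, so recent commits count more."""
    return sum(activity_weight(age, half_life_days) for age in commit_ages_days)


def relative_activity(project, raw_scores):
    """Map a project's raw score onto a 0-10 scale using the share of
    tracked projects that have a strictly lower raw score."""
    scores = list(raw_scores.values())
    below = sum(1 for score in scores if score < raw_scores[project])
    return round(10.0 * below / len(scores), 1)


# Ten hypothetical projects with distinct raw scores 0.0 .. 9.0.
tracked = {f"p{i}": float(i) for i in range(10)}
top_score = relative_activity("p9", tracked)
```

Under these assumptions, the most active of the ten projects outranks nine of them and gets a relative activity of 9.0, which matches the "top 10%" reading described above.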
open-parse
Posts with mentions or reviews of open-parse.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-04-07.
- Show HN: Beyond text splitting – improved file parsing for LLM's
- Running OCR against PDFs and images directly in the browser
I recently built a similar tool, except it’s configured to use some deep learning libraries for the table extraction. I’m excited to integrate unitable, which has state-of-the-art performance, later this week.
I built this because most of the basic layout detection libraries have terrible performance on anything non-trivial. Deep learning is really the long-term solution here.
https://github.com/Filimoa/open-parse
- Show HN: Open-source, high performance document chunking for LLM's
s3-ocr
Posts with mentions or reviews of s3-ocr.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-03-30.
- Running OCR against PDFs and images directly in the browser
My s3-ocr tool can do that with quite a bit of extra configuration.
https://github.com/simonw/s3-ocr