open-parse
Improved file parsing for LLM’s (by Filimoa)
textract-ai
TextractAI: Extract and process text from PDFs using Python, OpenAI API, and OCR techniques. (by lifeiswilde)
| | open-parse | textract-ai |
|---|---|---|
| Mentions | 3 | 1 |
| Stars | 1,782 | 7 |
| Growth | - | - |
| Activity | 9.2 | 6.7 |
| Last commit | 9 days ago | about 2 months ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
open-parse
Posts with mentions or reviews of open-parse.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-04-07.
- Show HN: Beyond text splitting – improved file parsing for LLM's
- Running OCR against PDFs and images directly in the browser
I recently built a similar tool, except it's configured to use some deep learning libraries for the table extraction. I'm excited to integrate unitable, which has state-of-the-art performance, later this week.
I built this because most of the basic layout detection libraries perform terribly on anything nontrivial. Deep learning is really the long-term solution here.
https://github.com/Filimoa/open-parse
- Show HN: Open-source, high performance document chunking for LLM's
textract-ai
Posts with mentions or reviews of textract-ai.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-03-30.
- Running OCR against PDFs and images directly in the browser
This is cool! I built something similar but it's CLI based. [1] https://github.com/lifeiswilde/textract-ai