Surya Alternatives

Similar projects and alternatives to surya

unstructured

Mentions: 12 | Stars: 6,415 | Activity: 9.8 | Language: HTML

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
marker

Mentions: 8 | Stars: 8,044 | Activity: 7.8 | Language: Python

Convert PDF to markdown quickly with high accuracy
deepdoctection

Mentions: 8 | Stars: 2,193 | Activity: 9.2 | Language: Python

A Repo For Document AI
open-webui

Mentions: 7 | Stars: 16,677 | Activity: 10.0 | Language: Svelte

User-friendly WebUI for LLMs (Formerly Ollama WebUI)
Parsr

Mentions: 7 | Stars: 5,656 | Activity: 4.6 | Language: JavaScript

Transforms PDF, Documents and Images into Enriched Structured Data
llmsherpa

Mentions: 6 | Stars: 943 | Activity: 6.6 | Language: Jupyter Notebook

Developer APIs to Accelerate LLM Projects
llama-hub

Mentions: 5 | Stars: 3,359 | Activity: 9.6 | Language: Jupyter Notebook

Discontinued. A library of data loaders for LLMs made by the community, to be used with LlamaIndex and/or LangChain
open-parse

Mentions: 3 | Stars: 1,601 | Activity: 8.2 | Language: Python

Improved file parsing for LLMs
cmdf

Mentions: 1 | Stars: 0 | Activity: 0.0 | Language: Python

this thing will fix misspelled commands by learning from your history.
llama_parse

Mentions: 2 | Stars: 827 | Activity: 9.1 | Language: Python

Parse files for optimal RAG
stable-diffusion-webui

Mentions: 2,808 | Stars: 129,975 | Activity: 9.9 | Language: Python

Stable Diffusion web UI
unitable

Mentions: 1 | Stars: 119 | Activity: 4.6 | Language: Jupyter Notebook

UniTable: Towards a Unified Table Foundation Model
Auto-GPT

Mentions: 104 | Stars: 72,359 | Activity: 9.8 | Language: Python

Discontinued. An experimental open-source attempt to make GPT-4 fully autonomous. [Moved to: https://github.com/Significant-Gravitas/Auto-GPT] (by Torantulino)

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better surya alternative or higher similarity.

surya reviews and mentions

Posts with mentions or reviews of surya. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-07.

Show HN: Beyond text splitting – improved file parsing for LLM's
4 projects | news.ycombinator.com | 7 Apr 2024

This looks great! You might be interested in surya - https://github.com/VikParuchuri/surya (I'm the author). It does OCR (much more accurate than tesseract), layout analysis, and text detection.
The OCR is slow on CPU (working on it), but faster than tesseract (CPU-only) on GPU.
Happy to discuss more, feel free to email me (in profile).
LlamaCloud and LlamaParse
9 projects | news.ycombinator.com | 20 Feb 2024

You may want to try https://github.com/VikParuchuri/surya (I'm the author). I've only benchmarked against tesseract, but it outperforms it by a lot (benchmarks in repo). Happy to discuss.
You could also try https://github.com/VikParuchuri/marker for general PDF parsing (I'm also the author) - it seems like you're more focused on tables.
Show HN: Surya – OCR and line detection in 93 languages
1 project | news.ycombinator.com | 13 Feb 2024
Surya: Multilingual Document OCR Toolkit
1 project | news.ycombinator.com | 13 Jan 2024

1 project | news.ycombinator.com | 12 Jan 2024