surya vs deepdoctection

surya

OCR, layout analysis, reading order, line detection in 90+ languages (by VikParuchuri)

AI Computer

deepdoctection

A Repo For Document AI (by deepdoctection)

document-parser document-image-analysis table-recognition OCR document-ai document-understanding Python document-layout-analysis table-detection Pytorch Tensorflow publaynet pubtabnet layoutlm NLP

Project         surya                                   deepdoctection
Mentions        6                                       8
Stars           6,871                                   2,245
Growth          -                                       8.4%
Activity        8.4                                     9.2
Latest Commit   4 days ago                              6 days ago
Language        Python                                  Python
License         GNU General Public License v3.0 only    Apache License 2.0

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
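
As a rough illustration of the Growth figure, the sketch below assumes it is the plain percentage change in star count from one month to the next. The page does not state the exact formula, so both function names and the formula itself are assumptions made for this example. The only inputs taken from the comparison above are deepdoctection's 2,245 stars and 8.4% growth.

```python
def month_over_month_growth(previous_stars: int, current_stars: int) -> float:
    """Percentage change in stars between two consecutive months (assumed formula)."""
    if previous_stars <= 0:
        raise ValueError("previous_stars must be positive")
    return (current_stars - previous_stars) / previous_stars * 100.0


def implied_previous_stars(current_stars: int, growth_percent: float) -> float:
    """Invert the assumed formula to estimate last month's star count."""
    return current_stars / (1.0 + growth_percent / 100.0)


# deepdoctection: 2,245 stars with 8.4% growth (figures from the table above).
previous = round(implied_previous_stars(2245, 8.4))
print(previous)                                            # 2071
print(round(month_over_month_growth(previous, 2245), 1))   # 8.4
```

Under this assumption, deepdoctection would have gained roughly 174 stars over the month. No growth value is listed for surya, so the same estimate cannot be made for it.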

surya

Posts with mentions or reviews of surya. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-07.

New open source AI model for document segmentation and unstructured ETL
1 project | news.ycombinator.com | 7 May 2024

Would this be able to incorporate the models from Surya - https://github.com/VikParuchuri/surya

Show HN: Beyond text splitting – improved file parsing for LLM's
4 projects | news.ycombinator.com | 7 Apr 2024

This looks great! You might be interested in surya - https://github.com/VikParuchuri/surya (I'm the author). It does OCR (much more accurate than tesseract), layout analysis, and text detection.
The OCR is slow on CPU (working on it), but faster than tesseract (CPU-only) on GPU.
Happy to discuss more, feel free to email me (in profile).

LlamaCloud and LlamaParse
9 projects | news.ycombinator.com | 20 Feb 2024

You may want to try https://github.com/VikParuchuri/surya (I'm the author). I've only benchmarked against tesseract, but it outperforms it by a lot (benchmarks in repo). Happy to discuss.
You could also try https://github.com/VikParuchuri/marker for general PDF parsing (I'm also the author) - it seems like you're more focused on tables.

Show HN: Surya – OCR and line detection in 93 languages
1 project | news.ycombinator.com | 13 Feb 2024

Surya: Multilingual Document OCR Toolkit
1 project | news.ycombinator.com | 13 Jan 2024
1 project | news.ycombinator.com | 12 Jan 2024

deepdoctection

Posts with mentions or reviews of deepdoctection. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-07.

Show HN: Beyond text splitting – improved file parsing for LLM's
4 projects | news.ycombinator.com | 7 Apr 2024

https://github.com/deepdoctection/deepdoctection
Have you tried this?

April 2023
40 projects | /r/dailyainews | 2 Jun 2023

DeepDoctection: Document extraction and analysis using deep learning models (https://github.com/deepdoctection/deepdoctection)

DeepDoctection: Document extraction and analysis using deep learning models
1 project | /r/programming | 27 Apr 2023
1 project | /r/patient_hackernews | 26 Apr 2023
1 project | /r/hackernews | 26 Apr 2023

DeepDoctection
1 project | /r/hypeurls | 26 Apr 2023
4 projects | news.ycombinator.com | 26 Apr 2023

[D] Can I use ML/AI to read the back panels of electronic components?
11 projects | /r/MachineLearning | 2 Jan 2023

deepdoctection/deepdoctection: A Repo For Document AI

What are some alternatives?

When comparing surya and deepdoctection you can also consider the following projects:

cmdf - this thing will fix misspelled commands by learning from your history.

DocumentInformationExtraction - Key Information Extraction From Documents: Evaluation And Generator

stable-diffusion-webui - Stable Diffusion web UI

Flowise - Drag & drop UI to build your customized LLM flow

Auto-GPT - An experimental open-source attempt to make GPT-4 fully autonomous. [Moved to: https://github.com/Significant-Gravitas/Auto-GPT]

CascadeTabNet - This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"

PentestGPT - A GPT-empowered penetration testing tool

donut - Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

bark - 🔊 Text-Prompted Generative Audio Model

Information-extraction-from-document - Graph Key Information Extraction: GKIE

JARVIS - JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

