parsee-pdf-reader VS parsee-datasets

Compare parsee-pdf-reader vs parsee-datasets and see how they differ.

parsee-pdf-reader

Parsee's PDF reader, specialized in the extraction of tables with numeric values and the accurate extraction and preservation of text paragraphs. Full support for scans and images. (by parsee-ai)

parsee-datasets

Datasets, case studies and benchmarks for extracting structured information from PDFs, HTML files or images, created by the Parsee.ai team. Datasets also on Hugging Face: https://huggingface.co/parsee-ai (by parsee-ai)
                parsee-pdf-reader    parsee-datasets
Mentions        1                    2
Stars           23                   61
Growth          -                    -
Activity        8.3                  6.4
Last commit     5 days ago           6 days ago
Language        Python               Jupyter Notebook
License         MIT License          MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
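
To make the activity scale more concrete, here is a minimal sketch that converts an activity score into a "top N%" bracket. It assumes the score runs linearly from 0 to 10. That is an assumption extrapolated from the single data point above (9.0 corresponds to the top 10%), not a formula published by this page, and the helper name `top_percent` is hypothetical.

```python
def top_percent(activity: float, scale_max: float = 10.0) -> float:
    """Return the 'top N%' bracket implied by an activity score.

    ASSUMPTION: the score is a linear percentile on a 0..scale_max scale,
    extrapolated from the one documented example (9.0 -> top 10%).
    """
    if not 0.0 <= activity <= scale_max:
        raise ValueError("activity must lie between 0 and scale_max")
    # Share of tracked projects at or above this score, as a percentage.
    return round((scale_max - activity) / scale_max * 100, 1)


# Activity scores of the two projects compared on this page.
for name, score in [("parsee-pdf-reader", 8.3), ("parsee-datasets", 6.4)]:
    print(f"{name}: activity {score} -> roughly top {top_percent(score)}%")
```

Under that linear assumption, a higher score simply means a smaller "top N%" bracket. The actual weighting of recent versus older commits is not described here, so the sketch does not attempt to reproduce it.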

parsee-pdf-reader

Posts with mentions or reviews of parsee-pdf-reader. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-31.
  • Parsee.ai – a framework to easily extract complex structured data with LLMs
    2 projects | news.ycombinator.com | 31 Mar 2024
    Yes, another LLM framework. This one is specialized on extracting structured data from various document types (mainly PDFs, images and HTML files).

    Comes with a new (separate) PDF extraction library that is focused on the extraction of numeric tables (tables with numbers, so especially for the financial domain): https://github.com/parsee-ai/parsee-pdf-reader

    Helps to easily set up a dataset to evaluate the performance of various LLMs on data extraction tasks, e.g. extracting revenue figures from financial reports: https://github.com/parsee-ai/parsee-datasets/tree/main/datas...

parsee-datasets

Posts with mentions or reviews of parsee-datasets. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-31.
  • FinRAG Datasets and Study
    1 project | news.ycombinator.com | 7 May 2024
    To test this, we created 3 different datasets, all based on the same selection of 1,156 randomly selected annual reports for the year 2023 of publicly listed US companies.

    The resulting (fully labeled) datasets contain a combined total of 10,404 rows, 37,536,847 tokens and 1,156 images and can be found on Github and Huggingface: https://github.com/parsee-ai/parsee-datasets/tree/main/datas...

    For our study, we are evaluating 8 state-of-the-art (M)LLMs on a subset of 100 reports with some interesting results.

  • Parsee.ai – a framework to easily extract complex structured data with LLMs
    2 projects | news.ycombinator.com | 31 Mar 2024
    Yes, another LLM framework. This one is specialized on extracting structured data from various document types (mainly PDFs, images and HTML files).

    Comes with a new (separate) PDF extraction library that is focused on the extraction of numeric tables (tables with numbers, so especially for the financial domain): https://github.com/parsee-ai/parsee-pdf-reader

    Helps to easily set up a dataset to evaluate the performance of various LLMs on data extraction tasks, e.g. extracting revenue figures from financial reports: https://github.com/parsee-ai/parsee-datasets/tree/main/datas...

What are some alternatives?

When comparing parsee-pdf-reader and parsee-datasets you can also consider the following projects:

ProTaska-GPT - Unleash the Potential of Datasets with Intelligent Tasks, Tutorials, and Algorithm Recommendations.

tiger - Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)

LangChain-SynData-RAG-Eval - LangChain, Llama2-Chat, and zero- and few-shot prompting are used to generate synthetic datasets for IR and RAG system evaluation

llm-chatbot-rag - A local LLM chatbot with RAG for PDF input files

instinct.cpp - instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG, Chatbot, Code interpreter) powered by language models. Call it langchain.cpp if you like.