Jupyter Notebook Datasets

Open-source Jupyter Notebook projects categorized as Datasets

Top 11 Jupyter Notebook Dataset Projects

  • indonlu

    The first large-scale natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)

  • cleora

    Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.

  • SKAB

    SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Anomaly Detection algorithms.

  • Project mention: SKAB: NEW Data - star count:238.0 | /r/algoprojects | 2023-09-25
  • Tegridy-MIDI-Dataset

    Tegridy MIDI Dataset for precise and effective Music AI models creation.

  • ekya

    Source code and datasets for Ekya, a system for continuous learning on the edge.

  • artificial-self-AMLD-2020

    Workshop material for the AMLD 2020 workshop on "Meet your Artificial Self: Generate text that sounds like you"

  • openfema-samples

    Code, dataset, and analysis samples that utilize the OpenFEMA API.

  • intel-processors

    Datasets for All Processors Manufactured By Intel

  • Project mention: Get CPU max turbo freq? API for CPU specs? | /r/PowerShell | 2023-06-22
  • parsee-datasets

    Datasets, case studies and benchmarks for extracting structured information from PDFs, HTML files or images, created by the Parsee.ai team. Datasets also on Hugging Face: https://huggingface.co/parsee-ai

  • Project mention: Parsee.ai – a framework to easily extract complex structured data with LLMs | news.ycombinator.com | 2024-03-31

    Yes, another LLM framework. This one is specialized on extracting structured data from various document types (mainly PDFs, images and HTML files).

    Comes with a new (separate) PDF extraction library that is focused on the extraction of numeric tables (tables with numbers, so especially for the financial domain): https://github.com/parsee-ai/parsee-pdf-reader

    Helps to easily set up a dataset to evaluate the performance of various LLMs on data extraction tasks, e.g. extracting revenue figures from financial reports: https://github.com/parsee-ai/parsee-datasets/tree/main/datas...

  • Data-Science-Data-Analystics-Contribution---Hacktoberfest-2022

    Submit just 4 PRs to earn T-shirts 🔥 in Hacktoberfest 2022

  • ProTaska-GPT

    Unleash the Potential of Datasets with Intelligent Tasks, Tutorials, and Algorithm Recommendations.

  • Project mention: Learn Data Science with a GPT-powered Tutor: ProTaska-GPT | /r/learnpython | 2023-06-19
NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Dataset projects in Jupyter Notebook? This list will help you:

Rank  Stars  Project
1     490    indonlu
2     472    cleora
3     291    SKAB
4     123    Tegridy-MIDI-Dataset
5     94     ekya
6     80     artificial-self-AMLD-2020
7     20     openfema-samples
8     15     intel-processors
9     10     parsee-datasets
10    5      Data-Science-Data-Analystics-Contribution---Hacktoberfest-2022
11    2      ProTaska-GPT
