Top 23 Jupyter Notebook Dataset Projects
-
covid-chestxray-dataset
We are building an open database of COVID-19 cases with chest X-ray or CT images.
-
whylogs
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
-
datasets
🎁 5,400,000+ Unsplash images made available for research and machine learning (by unsplash)
Here's a live demo with a simple React frontend. It's searching against an S3 bucket containing Unsplash's open source dataset of 25,000 images, plus a few of my own.
-
fma
-
clusterdata
cluster data collected from production clusters in Alibaba for cluster management research
-
raccoon_dataset
The dataset is used to train my own raccoon detector and I blogged about it on Medium
-
torchxrayvision
TorchXRayVision: A library of chest X-ray datasets and models. Classifiers, segmentation, and autoencoders.
-
ThoughtSource
A central, open resource for data and tools related to chain-of-thought reasoning in large language models. Developed @ Samwald research group: https://samwald.info/
-
hate-speech-and-offensive-language
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
-
OpenAI-CLIP
Project mention: Simple Implementation of OpenAI Clip (Tutorial) | news.ycombinator.com | 2024-02-21
-
TACO
-
SKAB
SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Anomaly Detection algorithms.
-
Awesome_Satellite_Benchmark_Datasets
Supplementary material for our paper "THERE IS NO DATA LIKE MORE DATA" is provided.
-
covid19za
-
roboflow-100-benchmark
Code for replicating Roboflow 100 benchmark results and programmatically downloading benchmark datasets
-
ImageNetV2
-
alis
-
goodreads
-
mnist1d
A 1D analogue of the MNIST dataset for measuring spatial biases and answering Science of Deep Learning questions.
-
clip-italian
-
openbrewerydb
-
medmcqa
A large-scale (194k) Multiple-Choice Question Answering (MCQA) dataset designed to address real-world medical entrance exam questions.
-
Tegridy-MIDI-Dataset
Jupyter Notebook Dataset discussion
Jupyter Notebook Dataset related posts
-
Simple Implementation of OpenAI Clip (Tutorial)
-
SKAB: NEW Data - star count:238.0
-
Update from Waymo spokesperson on the dog that was killed by a Waymo ADV
-
[P] Fine-tuning LLaMA on TheVault by AI4Code
-
A note from our sponsor - SaaSHub
www.saashub.com | 3 Dec 2024
Index
What are some of the best open-source Dataset projects in Jupyter Notebook? This list will help you:
# | Project | Stars |
---|---|---|
1 | covid-chestxray-dataset | 2,998 |
2 | whylogs | 2,657 |
3 | datasets | 2,443 |
4 | fma | 2,212 |
5 | clusterdata | 1,619 |
6 | raccoon_dataset | 1,268 |
7 | torchxrayvision | 936 |
8 | ThoughtSource | 899 |
9 | hate-speech-and-offensive-language | 779 |
10 | OpenAI-CLIP | 640 |
11 | TACO | 603 |
12 | SKAB | 328 |
13 | Awesome_Satellite_Benchmark_Datasets | 323 |
14 | covid19za | 255 |
15 | roboflow-100-benchmark | 248 |
16 | ImageNetV2 | 240 |
17 | alis | 243 |
18 | goodreads | 250 |
19 | mnist1d | 199 |
20 | clip-italian | 180 |
21 | openbrewerydb | 179 |
22 | medmcqa | 174 |
23 | Tegridy-MIDI-Dataset | 158 |