webdataset
NYU-DLSP20
Metric | webdataset | NYU-DLSP20
---|---|---
Mentions | 7 | 2
Stars | 1,962 | 6,625
Growth | 7.4% | -
Activity | 8.8 | 6.1
Last commit | 17 days ago | 3 months ago
Language | Python | Jupyter Notebook
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
webdataset
- How to use data stored in a (private) S3 Bucket for training?
  As an alternative, I've looked into using WebDataset, but couldn't figure out how to access data that is stored in a private bucket.
- [D] Title: Best tools and frameworks for working with million-billion image datasets?
- [D] Training networks on extremely large datasets (10+TB)?
  You can try webdataset (https://github.com/webdataset/webdataset).
- Question: TIFF image dataset - size in RAM.
- How to upload large amounts of data to a server?
  Compress it to .tar format and then load it as a webdataset.
- Does MIT 6.824 help for distributed deep learning?
  I would guess not, but there should be some good niche resources. Check out the introductory videos here: https://github.com/webdataset/webdataset
- How to effectively load a large text dataset with PyTorch?
  I found a pretty good solution that is similar to TFRecord from TensorFlow. You just need to load the data, tokenize it, and save the arrays in shards with the webdataset package.
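The tar-shard workflow that several of these mentions describe can be sketched with the Python standard library alone. This is a minimal illustration, not the webdataset package's own API (the package ships its own writer and loader classes). It relies only on the layout convention that files in a tar archive sharing a basename form one sample. The whitespace tokenizer, the `sample…` keys, the `.tokens.json` extension, and the function names are hypothetical choices made for the example.

```python
import io
import json
import os
import tarfile
import tempfile


def tokenize(text):
    # Toy whitespace tokenizer; a hypothetical stand-in for a real one.
    return text.lower().split()


def write_shards(texts, prefix, samples_per_shard=2):
    """Tokenize texts and write them into numbered .tar shards.

    Returns the list of shard paths and the token-to-id vocabulary.
    """
    vocab = {}
    paths = []
    tar = None
    for i, text in enumerate(texts):
        # Start a new shard every `samples_per_shard` samples.
        if i % samples_per_shard == 0:
            if tar is not None:
                tar.close()
            path = f"{prefix}-{len(paths):06d}.tar"
            paths.append(path)
            tar = tarfile.open(path, "w")
        ids = [vocab.setdefault(tok, len(vocab)) for tok in tokenize(text)]
        payload = json.dumps(ids).encode("utf-8")
        # The part before the first dot is the sample key;
        # the rest of the name identifies the field.
        info = tarfile.TarInfo(name=f"sample{i:08d}.tokens.json")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    if tar is not None:
        tar.close()
    return paths, vocab


def read_shards(paths):
    """Stream (key, token_ids) pairs from the shards, one sample at a time."""
    for path in paths:
        with tarfile.open(path, "r") as tar:
            for member in tar:
                key = member.name.split(".", 1)[0]
                ids = json.loads(tar.extractfile(member).read())
                yield key, ids


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shard_paths, _ = write_shards(
            ["a b a", "b c", "d"], os.path.join(tmp, "shard")
        )
        for key, ids in read_shards(shard_paths):
            print(key, ids)
```

Because each shard is read sequentially, the same files can be streamed from local disk or from remote storage without random access, which is the property the comments above are after for very large datasets.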
NYU-DLSP20
- A collection of some of the best PyTorch courses for beginners to learn PyTorch online
  And of course our NYU DL course 😉 https://github.com/Atcold/pytorch-Deep-Learning
- Week 6 practicum notebook
  I am going through the week 6 practicum notebook. Can someone shed some light on the following code in the train method:
What are some alternatives?
Practical_RL - A course in reinforcement learning in the wild
Real-time-Object-Detection-for-Autonomous-Driving-using-Deep-Learning - My Computer Vision project from my Computer Vision Course (Fall 2020) at Goethe University Frankfurt, Germany. Performance comparison between state-of-the-art Object Detection algorithms YOLO and Faster R-CNN based on the Berkeley DeepDrive (BDD100K) Dataset.
Made-With-ML - Learn how to design, develop, deploy and iterate on production-grade ML applications.
nlp-class - A Natural Language Processing course taught by Professor Ghassemi
ffcv - FFCV: Fast Forward Computer Vision (and other ML workloads!)
bitcoin_price_prediction - This project tries to predict the bitcoin price with machine and deep learning.
fastai - The fastai deep learning library
ydata-profiling - 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
ModelNet40-C - Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296
dl-colab-notebooks - Try out deep learning models online on Google Colab
PySyft - Perform data science on data that remains in someone else's server
ML-Workspace - 🛠All-in-one web-based IDE specialized for machine learning and data science.