webdataset
Made-With-ML
Metric | webdataset | Made-With-ML
---|---|---
Mentions | 7 | 51
Stars | 1,962 | 35,656
Growth | 7.4% | -
Activity | 8.8 | 6.8
Last commit | 17 days ago | 5 months ago
Language | Python | Jupyter Notebook
License | BSD 3-clause "New" or "Revised" License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
webdataset
- How to use data stored in a (private) S3 Bucket for training?
  As an alternative, I've looked into using WebDataset, but couldn't figure out how to access data that is stored in a private bucket.
- [D] Title: Best tools and frameworks for working with million-billion image datasets?
- [D] Training networks on extremely large datasets (10+TB)?
  You can try webdataset (https://github.com/webdataset/webdataset).
- Question: TIFF image dataset - size in RAM.
- How to upload large amounts of data to a server?
  Pack it into .tar files and then load it as a webdataset.
- Does mit 6.824 help for distributed deep learning?
  I would guess not, but there should be some good niche resources. Check out the introductory videos here: https://github.com/webdataset/webdataset
- How to effectively load a large text dataset with PyTorch?
  I found a pretty good solution that is similar to TFRecord from TensorFlow. You just need to load the data, tokenize it, and save the arrays in shards with the webdataset package.
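
Several of the mentions above describe the same workflow: pack samples into .tar shards, then stream them. The sketch below illustrates that layout using only the Python standard library, so it is not the webdataset package's own implementation. It relies on the WebDataset convention that files sharing a basename inside a tar archive form one sample. The helper names (`write_shards`, `read_shard`, `s3_pipe_url`), the shard pattern, and the bucket name are hypothetical. The `pipe:` URL form assumes the AWS CLI is installed and already configured with credentials for the private bucket.

```python
import io
import os
import tarfile
import tempfile
from collections import defaultdict


def write_shards(samples, pattern, samples_per_shard):
    """Write (key, {extension: bytes}) samples into numbered tar shards."""
    paths = []
    for start in range(0, len(samples), samples_per_shard):
        path = pattern % (start // samples_per_shard)
        with tarfile.open(path, "w") as tar:
            for key, fields in samples[start:start + samples_per_shard]:
                for ext, payload in fields.items():
                    # Files named "<key>.<ext>" belong to the same sample.
                    info = tarfile.TarInfo(name=f"{key}.{ext}")
                    info.size = len(payload)
                    tar.addfile(info, io.BytesIO(payload))
        paths.append(path)
    return paths


def read_shard(path):
    """Group the members of one shard back into samples by shared basename."""
    samples = defaultdict(dict)
    with tarfile.open(path, "r") as tar:
        for member in tar:
            key, ext = member.name.split(".", 1)
            samples[key][ext] = tar.extractfile(member).read()
    return dict(samples)


def s3_pipe_url(bucket, key):
    """Build a 'pipe:' URL that streams one shard through the AWS CLI.

    Assumption: the consumer accepts 'pipe:<command>' URLs and the AWS CLI
    can already authenticate against the (hypothetical) bucket.
    """
    return f"pipe:aws s3 cp s3://{bucket}/{key} -"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = [
            (f"sample{i:06d}", {"txt": f"text {i}".encode(), "cls": str(i % 2).encode()})
            for i in range(5)
        ]
        shard_paths = write_shards(data, os.path.join(tmp, "shard-%06d.tar"), 2)
        print(len(shard_paths), "shards written")
        print(sorted(read_shard(shard_paths[0])))
    print(s3_pipe_url("my-private-bucket", "train/shard-000000.tar"))
```

With the webdataset package installed, shards laid out this way can typically be opened by passing a shard path or a `pipe:` URL to its dataset class; check the package documentation for the exact options before relying on this.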
Made-With-ML
- [D] How do you keep up to date on Machine Learning?
  Made With ML
- Open-Source Production Machine Learning Course
- Advice for switching careers within analytics
  - Develop a (simple!) ML project and apply MLOps best practices to it. Ask ChatGPT all of your MLOps questions. I've joined this MLOps community, and it has been very helpful for knowing which path to follow to get better at MLOps. Thanks to them I found madewithml. I haven't done it yet, but it covers the whole MLOps side.
- Recommendation for MLOps resources
  Hey, I’m also working in ML. Here’s a great resource: https://madewithml.com. Also, check out Noah Gift’s book Practical MLOps.
- Ask HN: Resource to learn how to train and use ML Models
- Need help to find resources to learn ml ops
  Try replicating this setup: https://madewithml.com/
- MLops Resources
  madewithml
- Ask HN: How do I get started with MLOps?
  There's a really nice website by Goku Mohandas called Made With ML. IMO it is the best practical guide to MLOps out there: https://madewithml.com
  In case you want to dive a little deeper, https://fullstackdeeplearning.com/course/2022/ is also something folks have recommended to me.
- Resources for Current DE Interested in Learning Data Science
- Do organizations still need machine learning engineers?
  madewithml is pretty sweet, especially the MLOps side of things. It'll give you good skills in how Python development and ML deployment work.
What are some alternatives?
Practical_RL - A course in reinforcement learning in the wild
zero-to-mastery-ml - All course materials for the Zero to Mastery Machine Learning and Data Science course.
NYU-DLSP20 - NYU Deep Learning Spring 2020
mlops-zoomcamp - Free MLOps course from DataTalks.Club
ffcv - FFCV: Fast Forward Computer Vision (and other ML workloads!)
FLAML - A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
fastai - The fastai deep learning library
mlops-course - Learn how to design, develop, deploy and iterate on production-grade ML applications.
ModelNet40-C - Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296
practical-mlops-book - [Book-2021] Practical MLOps O'Reilly Book
PySyft - Perform data science on data that remains in someone else's server
Copulas - A library to model multivariate data using copulas.