webdataset
fastai
 | webdataset | fastai
---|---|---
Mentions | 7 | 9
Stars | 1,962 | 25,610
Growth | 7.4% | 1.3%
Activity | 8.8 | 8.0
Last commit | 17 days ago | 5 days ago
Language | Python | Jupyter Notebook
License | BSD 3-clause "New" or "Revised" License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
webdataset
- How to use data stored in a (private) S3 Bucket for training?
  As an alternative, I've looked into using WebDataset, but couldn't figure out how to access data that is stored in a private bucket.
- [D] Title: Best tools and frameworks for working with million-billion image datasets?
- [D] Training networks on extremely large datasets (10+TB)?
  You can try webdataset (https://github.com/webdataset/webdataset).
- Question: TIFF image dataset - size in RAM.
- How to upload large amounts of data to a server?
  Compress it to .tar format and then load it as a webdataset.
- Does mit 6.824 help for distributed deep learning?
  I would guess not, but there should be some good niche resources: check out the introductory videos here https://github.com/webdataset/webdataset
- How to effectively load a large text dataset with PyTorch?
  I found a pretty good solution that is similar to TFRecord from TensorFlow. You just need to load the data, tokenize it, and save the arrays in shards with the webdataset package.
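
The two suggestions above (pack the data into .tar files, or tokenize text and save the arrays in shards) can be sketched with only the Python standard library. This is a minimal sketch under stated assumptions, not the webdataset package's own writer: the toy whitespace tokenizer, the `write_shards` helper, the `shard-NNNNNN.tar` naming, and the `.json` payload are all hypothetical choices made for illustration. It relies on the WebDataset convention that files in a tar archive sharing the same basename form one sample.

```python
import io
import json
import tarfile
from pathlib import Path


def tokenize(text, vocab):
    """Toy whitespace tokenizer (hypothetical): maps each word to an integer id."""
    return [vocab.setdefault(word, len(vocab)) for word in text.lower().split()]


def write_shards(texts, out_dir, samples_per_shard=2):
    """Write tokenized texts into numbered .tar shards and return the shard paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    vocab, shard_paths = {}, []
    for start in range(0, len(texts), samples_per_shard):
        path = out_dir / f"shard-{len(shard_paths):06d}.tar"
        with tarfile.open(path, "w") as tar:
            for offset, text in enumerate(texts[start:start + samples_per_shard]):
                key = f"{start + offset:08d}"
                payload = json.dumps(tokenize(text, vocab)).encode("utf-8")
                # Files that share the basename `key` belong to the same sample.
                info = tarfile.TarInfo(name=f"{key}.json")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
        shard_paths.append(path)
    return shard_paths
```

With the webdataset package installed, shards written this way should be readable by passing a brace-expanded pattern such as `shard-{000000..000001}.tar` to `webdataset.WebDataset`; that step is not exercised here.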
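
For the private S3 bucket question above, one approach that may work is WebDataset's `pipe:` URL scheme, which reads each shard from the standard output of a shell command. The sketch below assumes the AWS CLI is installed and already configured with credentials that can read the bucket; the bucket name, prefix, shard naming, and both helper functions are hypothetical.

```python
def s3_pipe_urls(bucket, prefix, num_shards):
    """Build one `pipe:` URL per shard; each streams a tar file via the AWS CLI."""
    return [
        # `aws s3 cp <src> -` writes the object to stdout instead of a file.
        f"pipe:aws s3 cp s3://{bucket}/{prefix}/shard-{i:06d}.tar -"
        for i in range(num_shards)
    ]


def make_dataset(urls):
    """Create the dataset lazily so this sketch imports without webdataset."""
    import webdataset as wds  # third-party; assumed to be installed for this step
    return wds.WebDataset(urls)


# Hypothetical bucket and prefix, for illustration only.
urls = s3_pipe_urls("my-private-bucket", "train", 3)
```

Calling `make_dataset(urls)` would then hand the URLs to webdataset; authentication is left entirely to the AWS CLI, so no credentials appear in the training code.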
fastai
- Cleared AWS Machine Learning - Specialty exam.. Happy to help!!!
  Jeremy Howard's YouTube Channel - Jeremy maintains the fastai library, which is an excellent package that will help anyone build complicated ML architectures in minimum time. His YouTube Channel has a number of free courses which do an amazing job of covering a variety of ML topics, and he also maintains a very active forum for people studying ML.
- Coding your own AI in 2023 with fastai
  To create the AI we will use fastai. This is a Python library, which is built on top of PyTorch. No worries, you don't need to know how to code in Python. We will learn how this stuff works along the way :)
- Fast.ai starts a corporate partnership program
  You may know fast.ai as a popular deep learning course. There is also a deep learning library with the same name (https://github.com/fastai/fastai) as well as software development tools like nbdev (https://nbdev.fast.ai/).
  fast.ai has been offering education and tools for free for over 7 years, and has been approached by many companies asking for help. This program offers an avenue for businesses to get relevant professional services and support.
- People tricking ChatGPT “like watching an Asimov novel come to life”
  The "fastai" course is free, and does a really nice job walking you through building simple neural nets from the ground up:
  https://github.com/fastai/fastai
  What's going on here is the exact same thing, just much, much larger.
- Literate programming with Jupyter Notebook and nbdev
- Why no one uses nbdev for library development?
  Development NB: https://github.com/fastai/fastai/blob/master/nbs/09_vision.augment.ipynb
- [D] What Repetitive Tasks Related to Machine Learning do You Hate Doing?
  There is already a ton of momentum around automating ML workflows. I would suggest you contribute to a preexisting project like, for instance, PyTorch Lightning or fast.ai.
- Good practices for neural network training: identify, save, and document best models
  If you are unaware of what fastai is, its official description is:
- D I Refuse To Use Pytorch Because Its A Facebook
  Also, not a single docstring to document any code in the library - https://github.com/fastai/fastai/blob/master/fastai/vision/learner.py
What are some alternatives?
Practical_RL - A course in reinforcement learning in the wild
pytorch-lightning - Build high-performance AI models with PyTorch Lightning (organized PyTorch). Deploy models with Lightning Apps (organized Python to build end-to-end ML systems). [Moved to: https://github.com/Lightning-AI/lightning]
NYU-DLSP20 - NYU Deep Learning Spring 2020
fastbook - The fastai book, published as Jupyter Notebooks
Made-With-ML - Learn how to design, develop, deploy and iterate on production-grade ML applications.
Watermark-Removal-Pytorch - 🔥 CNN for Watermark Removal using Deep Image Prior with Pytorch 🔥.
ffcv - FFCV: Fast Forward Computer Vision (and other ML workloads!)
PySyft - Perform data science on data that remains in someone else's server
ModelNet40-C - Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296
lego-mindstorms - My LEGO MINDSTORMS projects (using set 51515 electronics)
ru-dalle - Generate images from texts. In Russian