| gds_course | dcai-lab
---|---|---
Mentions | 1 | 10
Stars | 86 | 401
Growth | - | 3.2%
Activity | 3.5 | 5.4
Last commit | 5 months ago | 4 months ago
Language | Jupyter Notebook | Jupyter Notebook
License | - | GNU Affero General Public License v3.0
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity - a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
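The two derived metrics can be illustrated with a minimal Python sketch. The page does not give the exact formulas, so everything here is an assumption: `month_over_month_growth` uses the ordinary percentage-change definition, and `activity_score` maps a hypothetical raw activity value (for example, a recency-weighted commit count) onto a 0-10 scale by percentile rank, which is one way to make 9.0 correspond to the top 10% of tracked projects.

```python
from bisect import bisect_left
from typing import List, Optional


def month_over_month_growth(stars_now: int, stars_month_ago: int) -> Optional[float]:
    """Percentage change in stars over one month (assumed definition).

    Returns None when there is no baseline to compare against.
    """
    if stars_month_ago <= 0:
        return None
    return (stars_now - stars_month_ago) / stars_month_ago * 100.0


def activity_score(raw: float, all_raw: List[float]) -> float:
    """Map a hypothetical raw activity value to a 0-10 relative score.

    The score is ten times the fraction of tracked projects that are
    strictly less active, so 9.0 means "more active than 90% of projects".
    """
    ranked = sorted(all_raw)
    less_active = bisect_left(ranked, raw)  # count of strictly smaller values
    return round(10.0 * less_active / len(ranked), 1)


if __name__ == "__main__":
    # 100 made-up projects with raw activity values 0..99.
    tracked = [float(v) for v in range(100)]
    print(activity_score(90.0, tracked))  # 9.0: in the top 10% of this sample
    print(month_over_month_growth(103, 100))
```

A percentile-based scale keeps the score comparable across projects of very different sizes, which fits the description of Activity as a relative number.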
gds_course
-
Who are your data science heroes?
Dani Arribas-Bel (modern legend in the making, check out his open source Geospatial Data Science course)
dcai-lab
-
Resources to learn practical/industry-focused ML (preferably using TensorFlow)?
Data-Centric AI. Honestly, if you've been working on ML pipelines, this might be familiar to you.
-
Andrew Ng, GitHub courses
Another great resource inspired by the Andrew Ng data-centric AI movement is the Introduction to Data-Centric AI course taught this past semester at MIT by PhDs.
-
Good Beginner Courses for ML?
Data-centric AI course. Brand new, taught for the first time a few months ago by MIT PhD grads. It covers how to ensure good data quality for your models. More data-science heavy.
-
[P] We are building a curated list of open source tooling for data-centric AI workflows, looking for contributions.
Thanks for the kind words! Make sure to check out the current open MIT course if you are just starting out: https://dcai.csail.mit.edu/
-
The Missing Semester of Your CS Education
Introduction to Data-Centric AI https://dcai.csail.mit.edu
- Introduction to Data-Centric AI
-
MIT Introduction to Data-Centric AI
Course homepage | Lecture videos on YouTube | Lab Assignments
What are some alternatives?
chrisalbon_com - Code for ChrisAlbon.com
snorkel - A system for quickly generating training data with weak supervision
cleanlab - The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
BotLibre - An open platform for artificial intelligence, chat bots, virtual agents, social media automation, and live chat automation.
llm-course - Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
deodel - A mixed attributes predictive algorithm implemented in Python.
chordviz - A convolutional neural network trained using PyTorch to predict the next chord (as tablature) on a guitar based on image data. Includes labeling software for the image data as well as an iOS app for hosting and running the model.
UBB-INFO - All projects from university.
nodevectors - Fastest network node embeddings in the west
dgl - Python package built to ease deep learning on graph, on top of existing DL frameworks.
dcai-course - Introduction to Data-Centric AI, MIT IAP 2023 🤖
refinery - The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.