Metric | dcai-lab | chordviz
---|---|---
Mentions | 10 | 2
Stars | 400 | 4
Growth | 3.0% | -
Activity | 5.4 | 6.9
Last commit | 4 months ago | 11 months ago
Language | Jupyter Notebook | Swift
License | GNU Affero General Public License v3.0 | -
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
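As a rough illustration of the Growth figure, the sketch below assumes growth is the change in stars over one month divided by the star count a month earlier. The page does not state its exact formula, so both helper functions and their names are hypothetical.

```python
def month_over_month_growth(stars_now: int, stars_month_ago: int) -> float:
    """Growth as a percentage of last month's star count (assumed definition)."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return (stars_now - stars_month_ago) / stars_month_ago * 100.0


def stars_month_ago_from_growth(stars_now: int, growth_percent: float) -> float:
    """Invert the assumed formula to estimate last month's star count."""
    return stars_now / (1.0 + growth_percent / 100.0)


# Using the figures listed for dcai-lab: 400 stars and 3.0% growth.
estimated_previous = stars_month_ago_from_growth(400, 3.0)
print(f"Estimated stars a month ago: {estimated_previous:.1f}")
```

Under that assumption, 3.0% growth on 400 stars corresponds to a gain of about a dozen stars over the month.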
dcai-lab
-
Resources to learn practical/industry-focused ML (preferably using TensorFlow)?
Data-Centric AI. Honestly, if you've been working on ML pipelines, this might be familiar to you.
-
Andrew Ng, GitHub courses
Another great resource inspired by the Andrew Ng data-centric AI movement is the Introduction to Data-Centric AI course taught this past semester at MIT by PhDs.
-
Good Beginner Courses for ML?
Data-centric AI course. Brand new, taught for the first time a few months ago by MIT PhD grads. It covers how to ensure good data quality for your models. More data-science heavy.
-
[P] We are building a curated list of open source tooling for data-centric AI workflows, looking for contributions.
Thanks for the kind words! Make sure to check out the current open MIT course if you are just starting out: https://dcai.csail.mit.edu/
-
The Missing Semester of Your CS Education
Introduction to Data-Centric AI https://dcai.csail.mit.edu
- Introduction to Data-Centric AI
-
MIT Introduction to Data-Centric AI
Course homepage | Lecture videos on YouTube | Lab Assignments
chordviz
-
The Matrix Calculus You Need for Deep Learning
Here’s an ML project I’ve been working on as a solo dev:
https://github.com/williamcotton/chordviz
Labeling software in React, CNN in PyTorch, prediction in a SwiftUI app. 12,000 and counting hand-labeled images of my hand on a guitar fretboard!
-
Introduction to Data-Centric AI
I have an ML project I started that involved manually labeling around 10,000 still frames of my hand on a guitar fretboard playing various chord shapes. I made a little web app with a keyboard interface for quickly adding labels to images. I got up to around an image a second when I got in the zone. I finished the dataset, got distracted by the birth of my son, and have literally done none of the “fun stuff” yet!
If anyone wants to have a crack at the data, it’s in git-lfs and here:
https://github.com/williamcotton/chordviz
What are some alternatives?
snorkel - A system for quickly generating training data with weak supervision
MatrixForensics - Collection of Matrix/Linear Algebra Information
cleanlab - The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
BotLibre - An open platform for artificial intelligence, chat bots, virtual agents, social media automation, and live chat automation.
llm-course - Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
deodel - A mixed attributes predictive algorithm implemented in Python.
UBB-INFO - All projects from university.
nodevectors - Fastest network node embeddings in the west
dgl - Python package built to ease deep learning on graph, on top of existing DL frameworks.
dcai-course - Introduction to Data-Centric AI, MIT IAP 2023 🤖
refinery - The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
optuna - A hyperparameter optimization framework