data-centric-AI
data-centric-ai
Metric | data-centric-AI | data-centric-ai
---|---|---
Mentions | 4 | 1
Stars | 990 | 1,070
Growth | - | 1.4%
Activity | 4.5 | 0.0
Last commit | 13 days ago | 5 months ago
Language | TeX | -
License | - | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
data-centric-AI
- Data quality could be the key behind GPT-4, not model
- [R] Data-centric Artificial Intelligence: A Survey
  Found relevant code at https://github.com/daochenzha/data-centric-AI
- Data-centric AI resources
data-centric-ai
- [P] Rubrix: Open-source Python framework for NLP data annotation, exploration, and monitoring
  In line with initiatives like Data-centric AI (https://https-deeplearning-ai.github.io/data-centric-comp/, https://github.com/HazyResearch/data-centric-ai), we firmly believe that iterating on datasets (finding label errors, dataset slicing, QA, etc.) will become more and more important, and tools for making this easier and involving different roles are needed.
What are some alternatives?
500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code - 500 AI Machine learning Deep learning Computer vision NLP Projects with code
argilla - Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.
machine-learning-for-software-engineers - A complete daily plan for studying to become a machine learning engineer.
pytorch-lightning - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. [Moved to: https://github.com/PyTorchLightning/pytorch-lightning]
ml-visuals - 🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.
prometheus-spec - Cryptoeconomically-safe trustless high-load computing on top of Bitcoin
MidJourney-Styles-and-Keywords-Reference - A reference containing Styles and Keywords that you can use with MidJourney AI. There are also pages showing resolution comparison, image weights, and much more!
pytorch-lightning - Build high-performance AI models with PyTorch Lightning (organized PyTorch). Deploy models with Lightning Apps (organized Python to build end-to-end ML systems). [Moved to: https://github.com/Lightning-AI/lightning]
refinery - The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
DikeDataset - Dataset with labeled benign and malicious files 🗃️
autoscraper - A Smart, Automatic, Fast and Lightweight Web Scraper for Python