awesome-data-centric-ai
Data_Science_Portfolio
| | awesome-data-centric-ai | Data_Science_Portfolio |
|---|---|---|
| Mentions | 7 | 2 |
| Stars | 303 | 0 |
| Growth | 1.3% | - |
| Activity | 3.2 | 7.6 |
| Last commit | 5 months ago | 11 months ago |
| Language | Jupyter Notebook | Jupyter Notebook |
| License | MIT License | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
awesome-data-centric-ai
-
Thoughts: Continue current degree with one year left, or start anew with degree apprenticeship
I would finish the degree anyway. It's only one year left. If teachers miss classes, I would disregard that and try to learn on my own, and then yes, I would move on to an internship (or even do it at the same time if that's possible). If you like, come and meet us at the Data-Centric AI Community and we can do some projects together :)
-
Data science projects
Definitely a lot of growth in the AI space, and it will evolve rapidly in the next few years. There are several paid propositions on the Data-Centric AI Community Discord; check them out.
-
I absolutely hate my internship
2: Tbh, quit (?) We have open jobs at the Data-Centric AI Community. Bonus points: you can vent there as much as you want.
-
Prioritise Data Science Projects
Let me invite you to the Data-Centric AI Community. We have several code-along sessions and projects, and a lot of beginners who are starting to learn DS that you can connect with.
-
Imbalanced data
If you need specific help with your project, you can find me at the Data-Centric AI Community and we'll be happy to take a look and give you some tips to move forward :)
-
Building my first Portfolio
You can share your progress with us on the Data-Centric AI Community and ask someone to review it. We often do that with CVs as well and help each other out.
-
[Q] How to generate synthetic dataset for anomaly detection?
Maybe you can use a synthetic data generator and use your current dataset as input? I believe there are a lot of GAN-based models for this purpose out there. The ones listed on https://github.com/Data-Centric-AI-Community/awesome-data-centric-ai are mostly focused on structured data, but I'm sure there are similar packages for images.
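To make the "use your current dataset as input" idea concrete, here is a minimal sketch. It is not a GAN and not the API of any package from the list above: it swaps in a much simpler generator that fits an independent Gaussian to each numeric column, samples synthetic "normal" rows, and injects labeled anomalies by shifting the column means. The names `fit_columns`, `sample_rows`, `make_anomaly_dataset`, the `shift` parameter, and the `real_rows` data are all hypothetical, chosen only for illustration.

```python
import random
import statistics


def fit_columns(rows):
    """Estimate a mean and standard deviation for each numeric column."""
    columns = list(zip(*rows))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]


def sample_rows(params, n, rng, shift=0.0):
    """Draw n synthetic rows; `shift` moves each column mean by that many standard deviations."""
    return [
        [rng.gauss(mu + shift * sigma, sigma) for mu, sigma in params]
        for _ in range(n)
    ]


def make_anomaly_dataset(rows, n_normal, n_anomalies, shift=6.0, seed=0):
    """Return (features, labels); label 1 marks an injected anomaly."""
    rng = random.Random(seed)
    params = fit_columns(rows)
    normal = sample_rows(params, n_normal, rng)
    anomalies = sample_rows(params, n_anomalies, rng, shift=shift)
    features = normal + anomalies
    labels = [0] * n_normal + [1] * n_anomalies
    return features, labels


# Hypothetical stand-in for "your current dataset": two numeric columns.
real_rows = [[10.1, 200.0], [9.8, 198.5], [10.4, 203.2], [9.9, 199.1], [10.2, 201.7]]
features, labels = make_anomaly_dataset(real_rows, n_normal=95, n_anomalies=5)
print(len(features), sum(labels))
```

Treating columns as independent ignores correlations between features, which is exactly what a GAN-based or other learned generator would try to capture, so this is only a baseline for checking that an anomaly detector finds the injected rows.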
Data_Science_Portfolio
-
I analyzed 200k comments to find reddit's favorite and least favorite cycling brands
You can find more detail about the project on my portfolio website here. You can find the code for the project here (though I still need to clean it up a bit).
-
Looking for feedback on a logistic regression model
I uploaded the Jupyter notebook to my GitHub here. The beginning explanation is long, so feel free to skip that if you'd like (it was written for people less familiar with data/machine learning). Also, keep in mind that I'm not a professional data scientist (maybe one day!), so I'm sure my lack of professional data experience is obvious to most of you. I'm just trying to solve some of the problems I've run into during my career and I'm here to learn how to get better at doing that.
What are some alternatives?
ydata-synthetic - Synthetic data generators for tabular and time-series data
machine_learning_complete - A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
DataScienceProjects
walkalongs - Resources and solutions of various technologies that I am currently learning
data-science-notes - Notes of IBM Data Science Professional Certificate Courses on Coursera
data-analytics-project-template - A python project starter template for data-analytics and data-science.
Portfolio
fullnamematchscore-go - Generates a match score of two person names from 0-100, where 100 is the highest, on how closely two individual full names match. The scoring is based on a series of tests, algorithms, AI, and an ever-growing body of Machine Learning-based generated knowledge
awesome-generative-ai-companies - A curated list of Generative AI companies, sorted by focus area and total fundraised amount.
COVID-US - Open benchmark dataset of COVID-19 related ultrasound imaging data, curated and systematically validated