| | awesome-data-centric-ai | ydata-synthetic |
|---|---|---|
| Mentions | 7 | 60 |
| Stars | 303 | 1,297 |
| Growth | 1.3% | 3.2% |
| Activity | 3.2 | 7.3 |
| Last commit | 5 months ago | 2 days ago |
| Language | Jupyter Notebook | Jupyter Notebook |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
awesome-data-centric-ai
-
Thoughts: Continue current degree with one year left, or start anew with degree apprenticeship
I would finish the degree anyway. It's only one year left. If teachers miss classes, I would disregard that and try to learn on my own, and then yes, I would move on to an internship (or even do it at the same time if that's possible). If you like, come and meet us at the Data-Centric AI Community and we can do some projects together :)
-
Data science projects
Definitely a lot of growth in the AI space, and it will evolve rapidly in the next few years. There are several paid propositions on the Data-Centric AI Community Discord, so check them out.
-
I absolutely hate my internship
2: Tbh, quit (?) We have open jobs at the Data-Centric AI Community. Bonus points: you can vent there as much as you want
-
Prioritise Data Science Projects
Let me invite you to the Data-Centric AI Community. We have several code-along sessions and projects, and a lot of beginners who are starting to learn DS that you can connect with.
-
Imbalanced data
If you need specific help with your project, you can find us at the Data-Centric AI Community and we'll be happy to take a look and give you some tips to move forward :)
-
Building my first Portfolio
You can share your progress with us at the Data-Centric AI Community and ask someone to review it. We often do that with CVs as well and help each other out.
-
[Q] How to generate synthetic dataset for anomaly detection?
Maybe you can use a synthetic data generator and use your current dataset as input? I believe there are a lot of GAN-based models for this purpose out there. The ones listed on https://github.com/Data-Centric-AI-Community/awesome-data-centric-ai are mostly focused on structured data, but I'm sure there are similar packages for images.
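As a rough illustration of the "use your current dataset as input" idea in the reply above, here is a minimal sketch. It does not use a GAN or any package from the linked list. Instead it swaps in a much simpler generator, a multivariate Gaussian fitted with NumPy, and the function name `generate_synthetic` is hypothetical. It assumes purely numeric, structured data with at least two columns.

```python
import numpy as np


def generate_synthetic(real, n_samples, rng):
    """Sample synthetic rows from a Gaussian fitted to `real`.

    `real` is a 2-D array of shape (n_rows, n_features), n_features >= 2.
    This is a stand-in for a learned generator (e.g. a GAN), not a
    replacement for one: it only preserves means and linear correlations.
    """
    mean = real.mean(axis=0)            # per-column mean of the real data
    cov = np.cov(real, rowvar=False)    # feature-by-feature covariance
    return rng.multivariate_normal(mean, cov, size=n_samples)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "current dataset": two numeric columns (assumed for illustration).
    real = rng.normal(loc=[0.0, 5.0], scale=[1.0, 2.0], size=(2000, 2))
    synthetic = generate_synthetic(real, 500, rng)
    print(synthetic.shape)
```

For anomaly detection, the synthetic rows would typically serve as extra "normal" examples. How to produce realistic anomalies depends on the domain and is not covered by this sketch.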
ydata-synthetic
-
Coding Wonderland: Contribute to YData Profiling and YData Synthetic in this Advent of Code
Send us your North ⭐️: "On the first day of Christmas, my true contributor gave to me..." a star in my GitHub tree! 🎵 If you love these projects too, star ydata-profiling or ydata-synthetic and let your friends know why you love it so much!
- ydata-synthetic: NEW Data - star count: 1083
-
I absolutely hate my internship
1: Try to work with what you have and augment your dataset (honestly, 10 points is crap)
-
Assessing the Quality of Synthetic Data with Data-Centric AI
Data quality is key for all applications and models, and LLMs are no exception :) I've been working on a small community project with synthetic data using ydata-synthetic (https://github.com/ydataai/ydata-synthetic), and it really shows! Underrepresentation (category imbalance) and missing data are two of the main issues!
-
SOMEBODY HELP ME!
The Data-Centric AI Community creates community projects from time to time and is probably willing to help you in your project.
-
Help for Data Scientist position
Join nice data communities and start networking.
-
How to become a beast in DS ?
You know what they say: "Tell me who your friends are, and I'll tell you who you are!". Hang out with DS beasts and learn from them :)
-
Hey guys, I have a few questions
Interesting question! I think our AI/ML devs at the Data-Centric AI Community could offer some nice perspectives to help you decide :)
-
Embarking on a Journey of 99 Data Science Projects - From Beginner to Expert
Sounds like an amazing journey! Feel free to add your projects to our awesome-python-for-data-science repo as you go! And in case you need a hand or feedback on the projects, we'll be happy to help at the Data-Centric AI Community.
-
Data science problems
The best thing to do is to get started with end-to-end projects in a collaborative environment (somewhat approaching real-world settings). You may find some interesting resources in this GitHub repository. The Data-Centric AI Community actually has a nice support system for this.
What are some alternatives?
machine_learning_complete - A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
REaLTabFormer - A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
walkalongs - Resources and solutions of various technologies that I am currently learning
Copulas - A library to model multivariate data using copulas.
DataScienceProjects
DeepRL-TensorFlow2 - 🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
Portfolio
Conditional-Sig-Wasserstein-GANs
fullnamematchscore-go - Generates a match score of two person names from 0-100, where 100 is the highest, on how closely two individual full names match. The scoring is based on a series of tests, algorithms, AI, and an ever-growing body of Machine Learning-based generated knowledge
pytorch-forecasting - Time series forecasting with PyTorch
awesome-generative-ai-companies - A curated list of Generative AI companies, sorted by focus area and total fundraised amount.
gretel-python-client - The Gretel Python Client allows you to interact with the Gretel REST API.