Metric | ISL-python | ydata-profiling
---|---|---
Mentions | 4 | 43
Stars | 181 | 12,053
Growth | - | 0.9%
Activity | 0.0 | 8.5
Last commit | over 1 year ago | 11 days ago
Language | Jupyter Notebook | Python
License | - | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ISL-python
- Andrew Ng's Machine Learning Specialization or Introduction to Statistical Learning? For someone who's comfortable with mathematics.
https://github.com/emredjan/ISL-python: this GitHub repo has the exercises in Python, but I am so pumped that the Python version is coming out this summer.
- Hey, I want to learn statistics with Python. Can anyone suggest a good book and a good YouTube tutorial? I am really poor at it and don't know the basic concepts.
- ESL vs ISLR books?
Here or here for the Python versions of ISLR.
- The Hundred-Page Machine Learning Book
I typically recommend a few different books to everyone who finishes the bootcamp, based on a self-assessment they take. I recommend some books based on their strengths, so they can find a career path sooner, and some books based on their weaknesses, so they can widen their cone of opportunity within ML.
In our consultancy, data science is done in Python and SQL (and PySpark, but I don't hand out books on that during bootcamp!), and ML delivery is a combination of math, software engineering, and architecture/product owner disciplines.
If you're strong in software engineering, I recommend Machine Learning Mastery with Python by Jason Brownlee as it's very hands-on in Python and helps you run code to "see" how ML works.
If you're weak in software engineering and Python, I recommend A Whirlwind Tour Of Python by Jake VanderPlas, and its companion book Python Data Science Handbook.
If you're strong in architecting / product management, I recommend Building Machine Learning Powered Applications by Emmanuel Ameisen since it explains it more from an SDLC perspective, including things like scoping, design, development, testing, general software engineering best practices, collaboration, etc.
If you're weak in architecting / product management, I typically recommend User Story Mapping by Jeff Patton and Making Things Happen by Scott Berkun, which are both excellent how-tos with great examples to build on.
If you're strong in math, I recommend Understanding Machine Learning: From Theory to Algorithms by Shalev-Shwartz and Ben-David, as it has all the mathematics for ML and actually has some pseudocode for implementation, which helps bridge the gap into actual software development (the book's title is very accurate!).
For someone who is weak in the math of ML, I recommend An Introduction to Statistical Learning by James et al. (along with the Python port of the code, https://github.com/emredjan/ISL-python), which I think does just enough hand-holding to move someone from "did high school math 20 years ago" to "I understand what these hyperparameters are optimizing for."
Anyway, I've spent a lot of time reading and reviewing books about ML, and my key takeaway is that the ones that get you closer to writing actual code to solve actual problems for actual people are the ones to focus on.
ydata-profiling
- FLaNK 25 December 2023
- First 15 Open Source Advent projects
6. Ydata-synthetic and Ydata-profiling by YData | Github | tutorial
- Coding Wonderland: Contribute to YData Profiling and YData Synthetic in this Advent of Code
Send us your North ⭐️: "On the first day of Christmas, my true contributor gave to me..." a star in my GitHub tree! 🎵 If you love these projects too, star ydata-profiling or ydata-synthetic and let your friends know why you love it so much!
- Data exploration is not dead
- Explore your data in a single line of code
- Which preprocessing steps to improve the performance of a naive bayes classifier
My suggestion: start with the EDA. There are a lot of packages that already automate that for you. My usual go-to: https://github.com/ydataai/ydata-profiling.
- Simulating sales data
If you're not sure about the behaviour of your data (i.e., if the original data has properties like seasonality), you can use ydata-profiling to profile your data first.
- I recorded a Data Science Project using Python and uploaded it on YouTube
Super cool! For EDA, you could give ydata-profiling a spin sometime and speed up the process!
- Ydata-Profiling and Dask
Hey guys,
We were recently at the Dask Demo Day, and we're hoping to launch a new feature on ydata-profiling: support for Dask dataframes!
We're looking for Dask Wizards to start collaborating on this feature, so if you're interested, please join us to define the roadmap of the project and start making it real.
Current GitHub branch is here: https://github.com/ydataai/ydata-profiling/tree/feat/dask
Dedicated dask channel here: https://discord.gg/EHDBuSSDuy
- 🧠 ydata-profiling + Dask!
We're looking for Dask Wizards 🧙🏻‍♂️ to start collaborating on this branch, so if you're interested, please join us to define the roadmap of the project and start making it real 🚀
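
Several of the posts above suggest ydata-profiling for a first pass of EDA. The following is a minimal sketch of what that might look like, assuming `pandas` and `ydata-profiling` are installed. The sample rows, the column names, and the helpers `make_sample_rows` and `write_profile` are hypothetical and do not come from any of the posts; only `ProfileReport` and its `to_file` method belong to the library.

```python
from pathlib import Path


def make_sample_rows(n=24):
    """Build hypothetical monthly sales rows with a pattern that repeats every 12 rows."""
    return [
        {
            "month": i % 12 + 1,
            "units": 100 + 10 * (i % 12),
            "region": "north" if i % 2 == 0 else "south",
        }
        for i in range(n)
    ]


def write_profile(rows, out_path="report.html"):
    """Write an HTML profiling report for the given rows and return its path.

    pandas and ydata-profiling are imported lazily, so this module can still
    be loaded on a machine where they are not installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.DataFrame(rows)
    # minimal=True turns off the more expensive computations,
    # which keeps a first look at the data quick.
    report = ProfileReport(df, title="Sample sales profile", minimal=True)
    report.to_file(out_path)
    return Path(out_path)


# Illustrative data only; replace with your own DataFrame source.
rows = make_sample_rows()
```

Calling `write_profile(rows)` should write `report.html` to the working directory, which can then be opened in a browser. Dropping `minimal=True` produces a fuller report at the cost of a longer run.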
What are some alternatives?
ISLR-python - An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani, 2013): Python code
dtale - Visualizer for pandas data structures
the-elements-of-statistical-learning - My notes and codes (jupyter notebooks) for the "The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani and Jerome Friedman
DataProfiler - What's in your data? Extract schema, statistics and entities from datasets
paip-lisp - Lisp code for the textbook "Paradigms of Artificial Intelligence Programming"
dataframe-go - DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
fecon235 - Notebooks for financial economics. Keywords: Jupyter notebook pandas Federal Reserve FRED Ferbus GDP CPI PCE inflation unemployment wage income debt Case-Shiller housing asset portfolio equities SPX bonds TIPS rates currency FX euro EUR USD JPY yen XAU gold Brent WTI oil Holt-Winters time-series forecasting statistics econometrics
lux - Automatically visualize your pandas dataframe via a single print! 📊 💡
ML-foundations - Machine Learning Foundations: Linear Algebra, Calculus, Statistics & Computer Science
get-started-with-JAX - The purpose of this repo is to make it easy to get started with JAX, Flax, and Haiku. It contains my "Machine Learning with JAX" series of tutorials (YouTube videos and Jupyter Notebooks) as well as the content I found useful while learning about the JAX ecosystem.
evidently - Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b
dataprep - Open-source low code data preparation library in Python. Collect, clean and visualize your data in Python with a few lines of code.