|3 days ago||27 days ago|
|BSD 3-clause "New" or "Revised" License||MIT License|
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Data Science toolset summary from 2021
13 projects | dev.to | 13 Nov 2021
Scikit-learn - One of the most widely used frameworks for Python-based data science tasks. It features various classification, regression and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Link - https://scikit-learn.org/
Intel Extension for Scikit-Learn
4 projects | news.ycombinator.com | 1 Nov 2021
Some work is currently being done to improve the computational primitives of scikit-learn and enhance its overall performance natively.
You can have a look at this exploratory PR: https://github.com/scikit-learn/scikit-learn/pull/20254
This other PR is a clear revamp of this previous one:
Scikit-Learn Version 1.0
11 projects | news.ycombinator.com | 14 Sep 2021
Just to clarify, scikit-learn 1.0 has not been released yet. The latest tag in the GitHub repo is 1.0.rc2.
Top 10 Python Libraries for Machine Learning
14 projects | dev.to | 9 Sep 2021
Website: https://scikit-learn.org/
GitHub Repository: https://github.com/scikit-learn/scikit-learn
Developed By: SkLearn.org
Primary Purpose: Predictive Data Analysis and Data Modeling
where is binary_metric function in sklearn package
1 project | reddit.com/r/learnmachinelearning | 20 Aug 2021
The name binary_metric appears in https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/metrics/_base.py, where it is a callable parameter of the helper _average_binary_score rather than a standalone function.
Use Scikit-Learn and Runflow
2 projects | dev.to | 6 Jul 2021
If you're not familiar with Scikit-learn and Runflow,
Confused as to what exactly a piece of code does
1 project | reddit.com/r/learnmachinelearning | 18 Jun 2021
Well, you can start at https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/model_selection/_validation.py, or maybe someone will guide you later.
What Makes Python Libraries So Important For Data Science Learning?
3 projects | reddit.com/r/u_Snoo36930 | 16 Jun 2021
Next comes the complexity of drawing the maximum possible number of valuable insights. Using different Python libraries such as Scikit-Learn, PyTorch, Pandas, etc., complications of data analysis can be solved within a minute. And the complexity associated with visualisation gets handled by data visualisation libraries like Matplotlib.
Is there a way to map cluster centers back to a dataframe?
1 project | reddit.com/r/learnpython | 19 May 2021
To avoid the issue with convergence (and the discrepancy between the labels_ and cluster_centers_), you can set tol=0, though this can of course lead to issues if convergence is a problem. There was an issue about it here. Assuming it's converged, then the order is fine.
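As a rough illustration of the mapping discussed above, the sketch below fits KMeans with tol=0 and then attaches each row's cluster centre to the dataframe by indexing cluster_centers_ with labels_. The toy data, the column names (x, y, center_x, center_y) and the n_init and random_state values are assumptions made for this example, not details from the original thread.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of points.
df = pd.DataFrame({
    "x": [0.0, 0.1, 0.2, 5.0, 5.1, 5.2],
    "y": [0.0, 0.2, 0.1, 5.0, 5.2, 5.1],
})

# tol=0 asks for strict convergence, as suggested in the comment above.
# n_init and random_state are set only to make this sketch reproducible.
km = KMeans(n_clusters=2, n_init=10, tol=0, random_state=0)
km.fit(df[["x", "y"]].to_numpy())

# labels_[i] is the row of cluster_centers_ that sample i was assigned to,
# so fancy indexing gives one centre per dataframe row.
assigned_centers = km.cluster_centers_[km.labels_]

df["cluster"] = km.labels_
df["center_x"] = assigned_centers[:, 0]
df["center_y"] = assigned_centers[:, 1]

print(df)
```

Assuming the fit has converged, each row's center_x and center_y should match the mean of the points sharing its cluster label.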
Any from scratch Hamming Loss implementations?
1 project | reddit.com/r/LearnML | 10 May 2021
The source code for the function you refer to is quite straightforward anyway. The definition of count_nonzero() is here.
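For readers who want a from-scratch version, here is a minimal sketch of the Hamming loss as the fraction of label positions where prediction and truth disagree. It uses NumPy's np.count_nonzero, which may not be the same count_nonzero() helper the comment above links to, and it ignores sample weights. The function name hamming_loss_from_scratch is hypothetical.

```python
import numpy as np


def hamming_loss_from_scratch(y_true, y_pred):
    """Fraction of label positions where y_true and y_pred differ.

    Works for 1-D label vectors and for 2-D multilabel indicator arrays.
    Sample weights are not supported in this sketch.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    if y_true.size == 0:
        raise ValueError("inputs must not be empty")
    # Count mismatching positions and divide by the total number of positions.
    return np.count_nonzero(y_true != y_pred) / y_true.size


# One wrong label out of four.
print(hamming_loss_from_scratch([1, 2, 3, 4], [2, 2, 3, 4]))
# Multilabel case: three of the four indicator entries differ.
print(hamming_loss_from_scratch([[0, 1], [1, 1]], [[0, 0], [0, 0]]))
```

Comparing with `!=` rather than subtracting keeps the sketch usable for non-numeric labels such as strings.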
prophet: NEW Data - star count:13766.0
1 project | reddit.com/r/algoprojects | 5 Dec 2021
Based in London, ML Masters Courses Should I Take? Is There Any Value?
1 project | reddit.com/r/MLQuestions | 3 Dec 2021
ML is stats. It boils down to which field you want to work in: whether you want to work with structured or unstructured data. A master's in ML would come in handy if you want to work with startups or small companies on unstructured data (text, images, etc.). Most large companies employ stats-based methods to predict future trends (time series data: ARIMA, prophet (like in Zillow), etc.), perform A/B testing, and so on, so they prefer that you have quant skills. You can look at the job requirements for a Data Scientist at Google here and their interview questions here. FAANG recruits PhDs as research scientists for their ML work using unstructured data.
prophet: NEW Data - star count:13699.0
1 project | reddit.com/r/algoprojects | 26 Nov 2021
1 project | reddit.com/r/algoprojects | 25 Nov 2021
1 project | reddit.com/r/algoprojects | 24 Nov 2021
1 project | reddit.com/r/algoprojects | 23 Nov 2021
1 project | reddit.com/r/algoprojects | 22 Nov 2021
1 project | reddit.com/r/algoprojects | 21 Nov 2021
1 project | reddit.com/r/algoprojects | 20 Nov 2021
prophet: NEW Data - star count:13664.0
1 project | reddit.com/r/algoprojects | 19 Nov 2021
What are some alternatives?
Keras - Deep Learning for humans
tensorflow - An Open Source Machine Learning Framework for Everyone
Surprise - A Python scikit for building and analyzing recommender systems
xgboost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
gensim - Topic Modelling for Humans
greykite - A flexible, intuitive and fast forecasting library
MLflow - Open source platform for the machine learning lifecycle
darts - A python library for easy manipulation and forecasting of time series.
sktime - A unified framework for machine learning with time series