scikit-learn
Pytorch
| | scikit-learn | PyTorch |
|---|---|---|
| Mentions | 64 | 257 |
| Stars | 53,503 | 64,111 |
| Growth | 1.5% | 2.9% |
| Activity | 9.9 | 10.0 |
| Latest commit | 1 day ago | 6 days ago |
| Language | Python | C++ |
| License | BSD 3-clause "New" or "Revised" License | BSD 1-Clause License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
scikit-learn
-
We are the developers behind pandas, currently preparing for the 2.0 release :) AMA
There's an issue here about that https://github.com/scikit-learn/scikit-learn/discussions/25450
-
Machine learning with Julia - Solve Titanic competition on Kaggle and deploy trained AI model as a web service
This is not a book, only an article, so it can't cover everything and assumes you already have some background knowledge to get the most from reading it. You should be familiar with machine learning in Python and know how to train models using the NumPy, Pandas, scikit-learn, and Matplotlib libraries. I also assume you know basic machine learning theory: the types of problems (regression and classification), the supervised learning workflow (fit/predict and evaluating quality with metrics), and common models such as the Random Forest classifier and its implementation in scikit-learn. Finally, it helps if you have participated in Kaggle competitions before, because you need an account on https://kaggle.com to understand and run all the code in this article.
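The fit/predict workflow mentioned above can be sketched in a few lines of scikit-learn (a minimal example with synthetic data standing in for the Titanic dataset):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a Kaggle-style tabular dataset
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)            # train (supervised learning)
preds = model.predict(X_test)          # predict on unseen data
print(accuracy_score(y_test, preds))   # evaluate quality with a metric
```

The same fit/predict/score pattern carries over to essentially every scikit-learn estimator.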
-
Best Websites For Coders
Scikit-learn: a Python module for machine learning built on top of SciPy
-
scikit-learn VS Rath - a user suggested alternative
2 projects | 12 Jan 2023
-
Boston Dataset was Removed from scikit-learn 1.2
Can you really call this "banning the dataset"? https://github.com/scikit-learn/scikit-learn/commit/8a86e219...
- ML Frameworks
-
Machine Learning Pipelines with Spark: Introductory Guide (Part 1)
The concepts are similar to those in the scikit-learn project, and they follow Spark's "ease of use" philosophy, giving you one more reason for adoption. You will learn more about these core concepts in this guide.
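For comparison, here is the shared pipeline concept expressed in scikit-learn (a minimal sketch; Spark ML's Pipeline of stages mirrors this same fit/transform pattern):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, random_state=0)

# A pipeline chains preprocessing steps with a final estimator;
# calling fit() runs each stage in order, as Spark ML does with its stages.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])
pipe.fit(X, y)
print(pipe.predict(X[:5]))
```

In both libraries the win is the same: the whole chain is a single object you can fit, evaluate, and deploy.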
-
How do you programmers make sense of production-level code?
If you look at the README for scikit-learn on GitHub, they say this.
Take a smaller segment to look at. Opening up the front page of a GitHub repo can be quite daunting. https://github.com/scikit-learn/scikit-learn
Pytorch
-
AI’s compute fragmentation: what matrix multiplication teaches us
My claim is subjective of course, but the idea is that there aren't many distinct kernels used in machine learning. It's all tensor contractions and element-wise operations. I'd argue that this can be maintained by hand without the need for automation or high-level abstraction.
Triton is used in a templated way for a very specific albeit pervasive hardware (PTX compatible GPUs), which is why it works so well. Here's some of the code: https://github.com/pytorch/pytorch/blob/a66625da3bcdf1e262dd...
Generalized kernel generation (i.e. synthesis of optimal performance from non-expert user defined kernels and novel hardware) would be fantastic to have, but it just doesn't seem particularly necessary in the field.
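To illustrate the "tensor contractions plus element-wise operations" claim (in NumPy for brevity, rather than a hand-written GPU kernel): a matrix multiply is a single contraction over a shared index, and most of what surrounds it is element-wise.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))

# Matrix multiplication as a tensor contraction over the shared index k
C = np.einsum("ik,kj->ij", A, B)
assert np.allclose(C, A @ B)

# A typical element-wise follow-up op (here, ReLU) applied to the result
relu = np.maximum(C, 0.0)
```

Attention, convolutions, and MLP layers all decompose into variations of these two primitives, which is why a small set of hand-tuned kernels covers so much of the field.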
-
Was just disqualified from a high school web design competition because our submission was too good
If you only care about data science and machine learning, then I would learn scikit-learn and PyTorch. Most companies and research groups have switched from Tensorflow to PyTorch, and Tensorflow itself replaced a number of frameworks before it (e.g., Caffe, Theano). I would also recommend reading An Introduction to Statistical Learning to get a basic understanding of different methods.
-
[D] PyTorch 2.0 Native Flash Attention 32k Context Window
You might look into https://github.com/pytorch/pytorch/pull/95793.
-
Torch 2.0 just went GA in the last day.
When you said "build" PyTorch, I thought you meant (simplified): `git clone https://github.com/pytorch/pytorch` # get the source code
-
PyTorch 2.0 Release
This is the master tracking list for MPS operator support: https://github.com/pytorch/pytorch/issues/77764
-
Apple Mac M1/M2 Pygmalion Support for oobabooga
There is also some hope of things using the GPU on the M1/M2. I did some testing and actually got it hooked up, with some caveats: not all PyTorch functions are mapped to work properly in the new MPS backend Apple has provided so far. It looks like both PyTorch and Apple are working on this, so it should improve. It also seems that the memory requirements of loading the models with GPU functionality are crazy high. That could be a side effect of the prototyping I did, but I'm not sure. If you're interested, more detail can be found here.
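A defensive way to opt into the MPS backend only when it is usable (a sketch; `torch.backends.mps.is_available()` is the documented check, and the helper name here is my own):

```python
def pick_device() -> str:
    """Prefer Apple's MPS backend, then CUDA, then CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # torch not installed; nothing to accelerate
    # Older torch builds lack the mps backend entirely, hence getattr
    if getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
        return "mps"  # Apple Silicon GPU
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"

print(pick_device())
```

For the unmapped-operator problem mentioned above, setting the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` before launching makes unsupported ops fall back to the CPU instead of erroring out.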
-
Accelerating AI inference?
PyTorch supports other kinds of accelerators (e.g. FPGAs, and https://github.com/pytorch/glow), but unless you want to become an ML systems engineer and have money and time to spare, or a business case to fund it, it is not worth it. In general, both PyTorch and TensorFlow have hardware abstractions that compile down to device code (XLA, https://github.com/pytorch/xla, https://github.com/pytorch/glow). TPUs and GPUs have very different strengths, so getting top performance requires a lot of manual optimization. Considering the cost of training LLMs, that is time well spent.
- Nope, idk.
-
Zero-Shot Image-to-Image Translation
While your mileage (clearly) varies from mine, Anaconda is the de facto standard in deep learning (and, more generally, in much of the Python data science ecosystem).
For example, when you go to the front page of PyTorch (https://pytorch.org/), the default installation method is Anaconda. It makes it easy to install things regardless of the operating system and with matching versions. Out of the box, for instance, it gives you GPU support on Apple Silicon with no extra installation steps.
Pip doesn't manage non-Python dependencies. Of course, you can install things manually any way you like (including inside Docker), but then it is up to you to make sure all dependencies are compatible. That is a non-trivial task, given the frequent updates of everything involved (CUDA kernels, Python versions, PyTorch/TF versions, and all the libraries tied to them one way or another).
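A minimal `environment.yml` illustrating the point (a sketch: the channel and package names follow the official PyTorch install instructions, but adjust the pins to your setup):

```yaml
name: dl
channels:
  - pytorch
  - nvidia
  - conda-forge
dependencies:
  - python=3.10
  - pytorch            # conda resolves a build matching the pinned CUDA below
  - torchvision
  - pytorch-cuda=11.8  # CUDA toolkit pin (Linux/Windows GPU installs only)
```

Create the environment with `conda env create -f environment.yml`; conda's solver is what keeps the CUDA toolkit, PyTorch build, and Python version mutually compatible.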
What are some alternatives?
Flux.jl - Relax! Flux is the ML library that doesn't make you tensor
Keras - Deep Learning for humans
Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
Surprise - A Python scikit for building and analyzing recommender systems
mediapipe - Cross-platform, customizable ML solutions for live and streaming media.
tensorflow - An Open Source Machine Learning Framework for Everyone
Apache Spark - Apache Spark - A unified analytics engine for large-scale data processing
flax - Flax is a neural network library for JAX that is designed for flexibility.
gensim - Topic Modelling for Humans
H2O - H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more