ServiceTalk
hummingbird
| | ServiceTalk | hummingbird |
|---|---|---|
| | 4 | 9 |
| Stars | 887 | 3,287 |
| Growth | 2.0% | 0.5% |
| Activity | 9.5 | 7.3 |
| | 1 day ago | 20 days ago |
| Language | Java | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hummingbird
- Treebomination: Convert a scikit-learn decision tree into a Keras model
- [D] GPU-enabled scikit-learn
If you are interested in just predictions, you can try Hummingbird. It is part of the PyTorch ecosystem. We take already-trained scikit-learn models and translate them into PyTorch models. From there you can run your model on any hardware supported by PyTorch, export it to TVM, ONNX, etc. Performance with hardware acceleration is quite good (orders of magnitude better than scikit-learn in some cases).
Most non-deep ML techniques aren't built on a crapload of matmuladd operations, which is what GPUs are good at and why we use them for DL. So relatively few components of sklearn would benefit from it, and I'd be deeply surprised if those parts weren't already implemented for accelerators in other libraries (or transformable via hummingbird). Contributing to those projects would be more valuable than another reimplementation, lest you fall into the 15 standards problem.
- Machine Learning with PyTorch and Scikit-Learn – The *New* Python ML Book
I think Rapids AI's cuML tried to go in this direction (essentially scikit-learn on the GPU): https://docs.rapids.ai/api/cuml/stable/api.html#logistic-reg.... For some reason it never really took off though.
Btw., going on a tangent, you might like Hummingbird (https://github.com/microsoft/hummingbird). It allows you to convert trained scikit-learn tree-based models to PyTorch. I watched the SciPy talk last year, and it's a super smart & elegant idea.
- Export and run models with ONNX
ONNX opens an avenue for direct inference using a number of languages and platforms. For example, a model could be run directly on Android to limit data sent to a third-party service. ONNX is an exciting development with a lot of promise. Microsoft has also released Hummingbird, which enables exporting traditional models (sklearn, decision trees, logistic regression, ...) to ONNX.
- Supreme Court, in a 6–2 ruling in Google v. Oracle, concludes that Google’s use of Java API was a fair use of that material
And Python.
- [D] Here are 3 ways to Speed Up Scikit-Learn - Any suggestions?
For inference, you can convert your models to other formats that support GPU acceleration. See Hummingbird https://github.com/microsoft/hummingbird
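
Several of the posts above describe translating a trained tree model into tensor operations so it can run on accelerator hardware. The following is a minimal sketch of one way such a translation can work, a GEMM-style encoding in which a decision tree is evaluated with matrix products and element-wise comparisons instead of branching. It is not Hummingbird's actual implementation: the tiny hand-written tree, the matrix names `A` to `E`, and the helper functions are all hypothetical, and plain Python lists stand in for tensors so the example has no dependencies.

```python
# Hypothetical 2-feature tree (an assumption for illustration only):
#   node 0: x[0] < 0.5 ? go to node 1 : leaf 2 (class 1)
#   node 1: x[1] < 2.0 ? leaf 0 (class 0) : leaf 1 (class 1)

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

# A[f][n] = 1 if internal node n tests feature f.
A = [[1, 0],
     [0, 1]]
# B[n] = threshold of internal node n.
B = [0.5, 2.0]
# C[n][l] = 1 if leaf l is in the left subtree of node n,
#          -1 if it is in the right subtree, 0 if it is not below n.
C = [[1, 1, -1],
     [1, -1, 0]]
# D[l] = number of ancestors of leaf l that are reached via a left branch.
D = [2, 1, 0]
# E[l][c] = 1 if leaf l predicts class c.
E = [[1, 0],
     [0, 1],
     [0, 1]]

def predict(X):
    """Evaluate the tree for a batch of rows using only matrix-style ops."""
    tested = matmul(X, A)                      # feature value seen by each node
    went_left = [[1 if v < t else 0 for v, t in zip(row, B)]
                 for row in tested]            # 1 where the node test is true
    path_score = matmul(went_left, C)          # per-leaf agreement score
    leaf_hit = [[1 if s == d else 0 for s, d in zip(row, D)]
                for row in path_score]         # exactly one leaf matches
    votes = matmul(leaf_hit, E)                # one-hot class per row
    return [row.index(max(row)) for row in votes]

def predict_reference(x):
    """The same tree written as ordinary branching, for comparison."""
    if x[0] < 0.5:
        return 0 if x[1] < 2.0 else 1
    return 1

if __name__ == "__main__":
    samples = [[0.2, 1.0], [0.2, 3.0], [0.9, 1.0], [0.9, 3.0]]
    print(predict(samples))
    print([predict_reference(x) for x in samples])
```

Every input row does the same fixed amount of arithmetic regardless of which path it takes through the tree, which is what makes this form a reasonable fit for batched execution on a GPU. With Hummingbird itself, none of this is written by hand; the conversion is a single call along the lines of `hummingbird.ml.convert(model, 'pytorch')`.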
What are some alternatives?
onnx - Open standard for machine learning interoperability
AkkaGRPC - Akka gRPC
swift - The Swift Programming Language
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT
commons-networking - commons networking utils
TLS Channel - A Java library that implements a ByteChannel interface over SSLEngine, enabling easy-to-use (socket-like) TLS for Java applications.
cuml - cuML - RAPIDS Machine Learning Library
Drift - An annotation-based Java library for creating Thrift serializable types and services.
rsocket-java - Java implementation of RSocket
chemprop - Message Passing Neural Networks for Molecule Property Prediction
tune-sklearn - A drop-in replacement for Scikit-Learn’s GridSearchCV / RandomizedSearchCV -- but with cutting edge hyperparameter tuning techniques.
docker - Docker - the open-source application container engine