lleaves vs miceforest
| | lleaves | miceforest |
|---|---|---|
| Mentions | 4 | 6 |
| Stars | 292 | 308 |
| Growth | - | - |
| Activity | 7.0 | 4.5 |
| Latest commit | 27 days ago | 7 days ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
lleaves
- LLeaves: An LLVM-based compiler for LightGBM decision trees
- Cold Showers
I built this decision tree (LightGBM) compiler last summer: https://github.com/siboehm/lleaves
It gets you ~10x speedups for batch predictions, more if your model is big. It's not complicated; it ended up being <1K lines of Python code. I heard a couple of stories like yours, where people had multi-node Spark clusters running LightGBM, and it always amused me, because if you compiled the trees instead you could get rid of the whole cluster.
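The drop-in workflow the comment describes is short. A minimal sketch, assuming lleaves and LightGBM are installed and `model.txt` is a model previously saved with `Booster.save_model()` (the feature matrix `X` is a placeholder for your own data):

```python
import lleaves

# Load a LightGBM model that was saved as a text file
llvm_model = lleaves.Model(model_file="model.txt")

# Compile the trees to native code via LLVM.
# This is a one-time cost paid at startup.
llvm_model.compile()

# predict() mirrors lightgbm.Booster.predict(); the ~10x
# speedup shows up on batch predictions like this one.
predictions = llvm_model.predict(X)
```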
- Tree compiler that speeds up LightGBM model inference by ~30x
In a near-future version I'll expose some of the compilation parameters. I was somewhat afraid that an API that's too complicated would deter people who just want a no-fuss drop-in replacement for LightGBM, but as long as I keep sane defaults and make the parameters optional it should be fine. Relevant parameters are definitely block size (which needs to be adjusted to the L1i cache size and the tree size) as well as the LLVM code model (a smaller address space increases single-batch prediction speed but doesn't work for large models). The thread-count-specific compilation I'm still looking into; it makes the API more complicated and so might not be worth it.
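Later lleaves releases did expose these knobs as optional keyword arguments on `compile()`. A sketch, assuming a recent lleaves version; the parameter names `fblocksize` and `fcodemodel` are taken from the lleaves documentation, so verify them against your installed release:

```python
import lleaves

llvm_model = lleaves.Model(model_file="model.txt")

# Both knobs are optional, so the no-fuss drop-in usage stays intact.
llvm_model.compile(
    fblocksize=34,       # instruction-block size; tune to L1i cache and tree size
    fcodemodel="small",  # smaller address space -> faster single-batch predict,
                         # but fails for very large models ("large" is the safe choice)
)
```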
miceforest
- Ask HN: What is the most impactful thing you've ever built?
- Cold Showers
Wow, very interesting, thanks for this. Daily batch predictions are all we do. I’m the maintainer of miceforest[1]; at a brief glance, do you think this would integrate well into the package? I’m always looking for ways to make this package faster.
[1] https://github.com/AnotherSamWilson/miceforest
- Miceforest: Fast, Memory Efficient, Multiple Imputation by Chained Equations
- Show HN: Multiple Imputation with Lightgbm
- Multiple Imputation with lightgbm
I am the maintainer of miceforest. I've just released a major update that a lot of you might find useful.
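For context, the basic miceforest workflow looks roughly like this. A sketch, assuming a recent miceforest release with pandas and NumPy installed; note that the `ImputationKernel` constructor's parameter names have shifted across major versions (e.g. `datasets` vs. `num_datasets`), so check your installed version:

```python
import miceforest as mf
import numpy as np
import pandas as pd

# Toy DataFrame with missing values injected into one column
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((200, 3)), columns=["a", "b", "c"])
df.loc[df.sample(frac=0.1, random_state=0).index, "a"] = np.nan

# The kernel trains one LightGBM model per column containing NaNs
kernel = mf.ImputationKernel(df, random_state=42)

# Run 3 iterations of Multiple Imputation by Chained Equations (MICE)
kernel.mice(3)

# Retrieve a completed (fully imputed) copy of the data
completed = kernel.complete_data()
```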
- R markdown README.rmd Equivalent in Python?
What are some alternatives?
mljar-supervised - Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
emlearn - Machine Learning inference engine for Microcontrollers and Embedded devices
ngboost - Natural Gradient Boosting for Probabilistic Prediction
scikit-learn - scikit-learn: machine learning in Python
m2cgen - Transform ML models into native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
eap_proxy - Proxy EAP packets between interfaces on Linux devices such as the Ubiquiti Networks EdgeRouter™ and UniFi® Security Gateway.
catboost - A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
orange - 🍊 :bar_chart: :bulb: Orange: Interactive data analysis
arduino_midi_library - MIDI for Arduino
inaturalist - The Rails app behind iNaturalist.org
beadm - FreeBSD utility to manage Boot Environments on ZFS filesystems.
Video-Hub-App - Official repository for Video Hub App