decision-forests
srbench
Metric | decision-forests | srbench
---|---|---
Mentions | 1 | 2
Stars | 651 | 194
Growth | 0.8% | 2.1%
Activity | 8.3 | 9.1
Last commit | 9 days ago | 3 months ago
Language | Python | Python
License | Apache License 2.0 | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
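As a rough illustration of how metrics like these could be computed, here is a minimal sketch. The exact formulas used by the tracker are not given above, so everything here is an assumption: growth is taken as the relative change in stars over one month, raw activity as a sum of commits weighted by an exponential decay (the `half_life_days` parameter is hypothetical), and the activity score as a linear 0-10 mapping of a project's percentile rank, which is consistent with "9.0 means top 10%".

```python
from bisect import bisect_left
from typing import Sequence


def monthly_growth(stars_now: int, stars_month_ago: int) -> float:
    """Month-over-month growth in stars, as a percentage (assumed formula)."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return 100.0 * (stars_now - stars_month_ago) / stars_month_ago


def raw_activity(commit_ages_days: Sequence[float], half_life_days: float = 30.0) -> float:
    """Sum of commits where recent commits weigh more than older ones.

    Each commit contributes 0.5 ** (age / half_life_days); the half-life
    is a hypothetical choice, not a value taken from the tracker.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw: float, all_raw: Sequence[float]) -> float:
    """Map a raw activity value to a 0-10 score by percentile rank (assumed linear)."""
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)  # number of projects strictly less active
    return 10.0 * below / len(ranked)


if __name__ == "__main__":
    print(monthly_growth(1008, 1000))
    print(raw_activity([0, 30, 60]))
    print(activity_score(9.0, [float(i) for i in range(10)]))
```

Under these assumptions, a project that is more active than 9 out of 10 tracked projects gets a score of 9.0, and a commit made one half-life ago counts half as much as one made today.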
decision-forests
-
Why do tree-based models still outperform deep learning on tabular data?
I can't explain it, but I help maintain TensorFlow Decision Forests [1] and Yggdrasil Decision Forests [2], and in an AutoML system at work that trains models on data from many different users, decision forest models get selected as the best (after AutoML tries various model types and hyperparameters) somewhere between 20% and 40% of the time, systematically. It's pretty interesting. The other model types considered are NNs, linear models (with automatic feature-cross generation), and a couple of other variations.
[1] https://github.com/tensorflow/decision-forests
srbench
-
Ask HN: Is genetic programming still actively researched?
NEAT and neuroevolution in general are interesting approaches. I also suggest looking at techniques like DENSER [1], which can be used to evolve deep networks (by applying evolution to the network structure rather than to the weights).
Genetic Programming (GP), however, has not evolved into NEAT (which itself is not very recent, having been published in 2002); rather, neuroevolution has become one of the topics within evolutionary computation (EC). For example, one of the largest yearly conferences on evolutionary computation (GECCO) [2] took place just last month with both neuroevolution and GP tracks. It is true, however, that the success of neural techniques has affected the community: there is ongoing discussion of the role of EC and, for example, more space is given to hybrid work (see the joint track on evolutionary machine learning [3] at the evostar event).
Related to the original post, recent research on GP can be found in the proceedings of GECCO (GP track), EuroGP (part of evostar), PPSN (Parallel Problem Solving from Nature), and IEEE CEC (IEEE Congress on Evolutionary Computation), and in journals such as Genetic Programming and Evolvable Machines (GPEM), Swarm and Evolutionary Computation (SWEVO), and IEEE Transactions on Evolutionary Computation (IEEE TEVC). The list is not exhaustive, but these are some well-known venues.
For a less "daunting" starting point, some recent techniques are being added to the SRBench benchmark suite [4], with links to both the code and the paper describing the technique.
[1] Assunção, F., Lourenço, N., Machado, P., & Ribeiro, B. (2019, March). Fast DENSER: Efficient deep neuroevolution. In European Conference on Genetic Programming (pp. 197-212). Cham: Springer International Publishing.
[2] https://gecco-2023.sigevo.org/HomePage
[3] https://www.evostar.org/2024/eml/
[4] https://github.com/cavalab/srbench
-
Why do tree-based models still outperform deep learning on tabular data?
A great paper and an important result.
However, it does not cite the highly relevant SRBench paper from 2021, which also carefully curates a suitable set of regression benchmarks and shows that Genetic Programming approaches likewise tend to outperform deep learning.
https://github.com/cavalab/srbench
cc u/optimalsolver
What are some alternatives?
Spearmint - Spearmint Bayesian optimization codebase
yggdrasil-decision-forests - A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
higgs-logistic-regression