Spearmint
srbench
Metric | Spearmint | srbench
---|---|---
Mentions | 2 | 2
Stars | 1,529 | 192
Growth | 0.1% | 4.2%
Activity | 0.0 | 9.1
Last commit | over 4 years ago | 3 months ago
Language | Python | Python
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Spearmint
-
Why do tree-based models still outperform deep learning on tabular data?
It occurs to me that a system, trained on peer-reviewed applied-machine-learning literature and Kaggle winners, that generates candidates for structured feature-engineering specifications, based on plaintext descriptions of columns' real-world meaning, should be considered a requisite part of the "meta" here.
Ah, and then you could iterate within the resulting feature-engineering-suggestion space as a hyper-parameter between experiments, which could be optimized with e.g. https://github.com/HIPS/Spearmint . The papers write themselves!
-
[D] What kind of Hyperparameter Optimisation do you use?
This was some time ago but I had some promising results with Bayesian optimization using a Gaussian Process prior. The method was developed by the guys who wrote Spearmint. That library doesn't support parallelization but I implemented the same technique in Scala without too much difficulty.
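The technique mentioned in the comment above, Bayesian optimization with a Gaussian Process prior, can be illustrated with a minimal sketch. This is not Spearmint's code or API: the function names (`rbf`, `gp_fit`, `gp_predict`, `bayes_opt`), the fixed squared-exponential kernel, its length scale, the jitter value, and the grid search over candidates are all assumptions chosen to keep the example short and dependency-free. It handles a single hyperparameter and picks each new trial by maximizing expected improvement.

```python
import math
import random

def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two scalars (assumed kernel choice)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor; diagonal clamped against round-off."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def gp_fit(xs, ys, jitter=1e-4):
    """Factor the kernel matrix once and precompute L^{-1} y."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (jitter if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    return L, solve_lower(L, ys)

def gp_predict(xs, L, w, x_new):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    v = solve_lower(L, [rbf(x, x_new) for x in xs])
    mean = sum(vi * wi for vi, wi in zip(v, w))
    var = max(1.0 - sum(vi * vi for vi in v), 1e-12)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over `best` when minimizing."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def bayes_opt(objective, low, high, n_init=3, n_iter=12, seed=0):
    """Minimize `objective` on [low, high]; returns the best (x, y) evaluated."""
    rng = random.Random(seed)
    xs = [rng.uniform(low, high) for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    grid = [low + (high - low) * i / 100 for i in range(101)]
    for _ in range(n_iter):
        y_mean = sum(ys) / len(ys)
        centered = [y - y_mean for y in ys]  # the GP assumes a zero mean
        L, w = gp_fit(xs, centered)
        best = min(centered)
        scores = [expected_improvement(*gp_predict(xs, L, w, g), best)
                  for g in grid]
        x_next = grid[scores.index(max(scores))]
        xs.append(x_next)
        ys.append(objective(x_next))
    i_best = ys.index(min(ys))
    return xs[i_best], ys[i_best]

if __name__ == "__main__":
    # Hypothetical objective standing in for a validation loss.
    x_best, y_best = bayes_opt(lambda x: (x - 0.3) ** 2, 0.0, 1.0)
    print(x_best, y_best)
```

In practice the objective would be an expensive training-and-validation run, which is why spending extra computation on choosing each trial point can pay off. A production setup would also fit the kernel hyperparameters rather than fixing them as done here.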
srbench
-
Ask HN: Is genetic programming still actively researched?
NEAT and neuroevolution in general are interesting approaches. I also suggest checking out techniques like DENSER [1], which can be used to evolve deep networks (by applying the evolutionary part to the network structure and not to the weights).
Genetic Programming (GP), however, has not evolved into NEAT (which is itself not very recent, having been published in 2002); rather, neuroevolution has become one of the topics within evolutionary computation (EC). For example, one of the largest yearly conferences on evolutionary computation, GECCO [2], took place just last month with both neuroevolution and GP tracks. It is true, however, that the success of neural techniques has affected the community: it has prompted discussion of the role of EC and, for example, given more space to hybrid work (see the joint track on evolutionary machine learning [3] at the evostar event).
Related to the original post, recent research on GP can be found in the proceedings of GECCO (GP track), EuroGP (part of evostar), PPSN (Parallel Problem Solving from Nature), and IEEE CEC (IEEE Congress on Evolutionary Computation), and in journals such as Genetic Programming and Evolvable Machines (GPEM), Swarm and Evolutionary Computation (SWEVO), and IEEE Transactions on Evolutionary Computation (IEEE TEVC). The list is not exhaustive, but these are some well-known venues.
For a less "daunting" starting point, some recent techniques are being added to the SRBench benchmark suite [4], with links to both the code and the paper describing the technique.
[1] Assunção, F., Lourenço, N., Machado, P., & Ribeiro, B. (2019, March). Fast DENSER: Efficient deep neuroevolution. In European Conference on Genetic Programming (pp. 197-212). Cham: Springer International Publishing.
[2] https://gecco-2023.sigevo.org/HomePage
[3] https://www.evostar.org/2024/eml/
[4] https://github.com/cavalab/srbench
-
Why do tree-based models still outperform deep learning on tabular data?
A great paper and an important result.
However, it does not cite the highly relevant SRBench paper from 2021, which also carefully curates a suitable set of regression benchmarks and shows that Genetic Programming approaches also tend to outperform deep learning.
https://github.com/cavalab/srbench
cc u/optimalsolver
What are some alternatives?
optuna - A hyperparameter optimization framework
yggdrasil-decision-forests - A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
decision-forests - A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models in Keras.
axe-testcafe - The helper for using Axe in TestCafe tests
youtube-react - A Youtube clone built in React, Redux, Redux-saga
spaceopt - Hyperparameter optimization via gradient boosting regression
higgs-logistic-regression