Empirical_Study_of_Ensemble_Learning_Methods
Training ensemble machine learning classifiers, with flexible templates for repeated cross-validation and parameter tuning
-
mljar-supervised
Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
I've actually made the same kind of graph before. In the image, each point is the average of 5 out-of-fold predictions from one trial of k-fold cross-validation. I repeated the procedure 40 times to visualize the out-of-fold accuracy on the Wisconsin Diagnostic Breast Cancer data set (560 observations of 30 numeric variables). I evaluated 14 models for classification:
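The repeated-CV procedure described above can be sketched roughly like this, using scikit-learn's built-in copy of the breast cancer data set. The choice of logistic regression and the 5-repeat count here are placeholders for illustration, not the 14 models or 40 repeats used in the original experiment:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # WDBC: 30 numeric features

# One example model; the experiment above compares 14 different classifiers.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

n_repeats, k = 5, 5  # the original uses 40 repeats of 5-fold CV
accuracies = []
for seed in range(n_repeats):
    # Re-shuffle the folds each repeat so every trial is a fresh split.
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    oof = cross_val_predict(model, X, y, cv=cv)  # out-of-fold predictions
    accuracies.append((oof == y).mean())  # one accuracy point per trial

print(np.mean(accuracies), np.std(accuracies))
```

Each repeat yields one out-of-fold accuracy; plotting the full set of repeats gives a picture of the fold-to-fold variability rather than a single point estimate.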
Ah they haven't quite gotten around to supporting multiclass classification yet! https://github.com/dswah/pyGAM/pull/213
What machine did you use for the comparison? I'd like to check the performance of the AutoML system I'm working on.
I don't know anything about scikit-optimize. Optuna doesn't support less-constrained parameter distributions like normal/log-normal, which are useful when approaching a new problem, and it doesn't implement the constant-liar heuristic for TPE. The latter is easy to fix; the former can be worked around by carefully observing the ranges of good parameters and doing a re-run or two.