automlbenchmark
Ray
 | automlbenchmark | Ray |
---|---|---|
Mentions | 3 | 42 |
Stars | 380 | 31,101 |
Growth | 3.4% | 3.4% |
Activity | 6.7 | 10.0 |
Last commit | 6 days ago | 4 days ago |
Language | Python | Python |
License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
automlbenchmark
-
Show HN: Web App with GUI for AutoML on Tabular Data
Here is a benchmark done by an independent team of researchers: https://openml.github.io/automlbenchmark/
I think most overfitting is avoided with the early stopping technique.
Underfitting can be avoided by using a long training time.
-
Show HN: AutoAI
Your list excludes most of the well-known open-source AutoML tools such as auto-sklearn, AutoGluon, LightAutoML, MLJarSupervised, etc. These tools have been very extensively benchmarked by the OpenML AutoML Benchmark (https://github.com/openml/automlbenchmark) and have papers published, so they are pretty well-known to the AutoML community.
Regarding H2O.ai: Frankly, you don't seem to understand H2O.ai's AutoML offerings.
I'm the creator of H2O AutoML, which is open source, and there's no "enterprise version" of H2O AutoML. The interface is simple -- all you need to specify is the training data and target. We have included DNNs in our set of models since the first release of the tool in 2017. Read more here: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html We also offer full explainability for our models: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/explain.html
H2O.ai develops another AutoML tool called Driverless AI, which is proprietary. You might be conflating the two. Neither of these tools needs to be used on the H2O AI Cloud. Both tools pre-date our cloud by many years and can be used on a user's own laptop/server very easily.
Your Features & Roadmap list in the README indicates that your tool does not yet offer DNNs, so either you should update your post here or update your README if it's incorrect: https://github.com/blobcity/autoai/blob/main/README.md#featu...
Lastly, I thought I would mention that there's already an AutoML tool called "AutoAI" by IBM. Generally, it's not a good idea to have name collisions in a small space like the AutoML community. https://www.ibm.com/support/producthub/icpdata/docs/content/...
-
Show HN: Mljar Automated Machine Learning for Tabular Data (Explanation,AutoDoc)
I'm also curious how it compares! The package will be included in the newest comparison done by the OpenML people: https://github.com/openml/automlbenchmark
I have an old comparison against an older closed-source system
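The first comment in this list credits early stopping with avoiding most overfitting. As a rough illustration only, here is a minimal patience-based sketch in plain Python. The function name `early_stopping_round`, the `patience` and `min_delta` parameters, and the loss values are all hypothetical; this is not how any of the benchmarked AutoML tools implement it.

```python
def early_stopping_round(val_losses, patience=3, min_delta=0.0):
    """Scan validation losses round by round and stop once `patience`
    consecutive rounds fail to improve on the best loss seen so far.

    Returns (best_round, best_loss). All names here are illustrative.
    """
    best_loss = float("inf")
    best_round = -1
    rounds_without_improvement = 0

    for round_idx, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss = loss
            best_round = round_idx
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= patience:
                # Validation loss has stalled; stop training here.
                break

    return best_round, best_loss


# Made-up validation losses: training stalls after round 3,
# so the scan stops before ever seeing the final value.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.53, 0.56, 0.40]
best_round, best_loss = early_stopping_round(losses, patience=3)
print(best_round, best_loss)
```

A larger `patience` trades extra training time for a lower risk of stopping too early, which is the same tension the comment describes between overfitting and underfitting.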
Ray
-
Open Source Advent Fun Wraps Up!
22. Ray | Github | tutorial
-
Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Custom Models
Training times for GSM8k are mentioned here: https://github.com/ray-project/ray/tree/master/doc/source/te...
- Ray – an open source project for scaling AI workloads
-
Methods to keep agents inside grid world.
Here's a reference from RLlib that points to docs and an example, and here's one from one of my projects that includes all my own implementations
-
TransformerXL + PPO Baseline + MemoryGym
RLlib
- Is dynamic action masking possible in Rllib?
-
AWS re:Invent 2022 Recap | Data & Analytics services
⦿ AWS Glue Data Quality - Automatic data quality rule recommendations based on your data
⦿ AWS Glue for Ray - Data integration with Ray (ray.io), a popular new open-source compute framework that helps you scale Python workloads
-
Think about it for a second
https://ray.io (just dropping the link)
-
Elixir Livebook now as a desktop app
I've wondered whether it's easier to add data analyst stuff to Elixir that Python seems to have, or add features to Python that Erlang (and by extension Elixir) provides out of the box.
From what I can see, if you want multiprocessing in Python in an easier way (let's say running async), you have to use something like ray core[0], then if you want multiple machines you need redis(?). Elixir/Erlang supports this out of the box.
Explorer[1] is an interesting approach, where it uses Rust via Rustler (Elixir library to call Rust code) and uses Polars as its dataframe library. I think Rustler needs to be reworked for this use case, as it can be slow to return data. I made initial improvements which drastically improve encoding (https://github.com/elixir-nx/explorer/pull/282 and https://github.com/elixir-nx/explorer/pull/286, tldr 20+ seconds down to 3).
[0] https://github.com/ray-project/ray
-
Learn various techniques to reduce data processing time by using multiprocessing, joblib, and tqdm concurrent
Adding these for anyone who had a similar question about Ray vs dask 1, 2, 3
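Several posts above discuss spreading Python work across workers. As a hedged, standard-library-only sketch of the basic fan-out/gather pattern (not Ray, Dask, or joblib code), the example below maps a function over inputs with `concurrent.futures`. The names `process_record` and `process_all` are hypothetical. A thread pool is used so the sketch runs anywhere; for CPU-bound work you would typically swap in `ProcessPoolExecutor`, which has the same `map` interface but needs picklable top-level functions and, on some platforms, an `if __name__ == "__main__":` guard.

```python
from concurrent.futures import ThreadPoolExecutor


def process_record(n):
    """Stand-in for a per-record task: sum of squares below n."""
    return sum(i * i for i in range(n))


def process_all(records, max_workers=4):
    """Fan the records out to a pool of workers and gather the results.

    Executor.map returns results in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_record, records))


results = process_all([10, 100, 1000])
print(results)
```

Frameworks such as Ray and Dask build on the same submit-and-gather idea and add scheduling across multiple machines, which the standard library does not provide.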
What are some alternatives?
autogluon - Fast and Accurate ML in 3 Lines of Code
optuna - A hyperparameter optimization framework
nni - An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
autokeras - AutoML library for deep learning
Faust - Python Stream Processing
MindsDB - The platform for customizing AI from enterprise data
gevent - Coroutine-based concurrency library for Python
adanet - Fast and flexible AutoML with learning guarantees.
stable-baselines - A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
mljar-supervised - Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
SCOOP - Scalable COncurrent Operations in Python