Metric | catboost | guildai
---|---|---
Mentions | 8 | 16
Stars | 7,776 | 859
Growth | 1.1% | 0.5%
Activity | 9.9 | 8.8
Last commit | 5 days ago | 9 months ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
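As a rough illustration of the two derived metrics, the sketch below computes month-over-month star growth and maps a raw activity value onto a 0 to 10 scale by percentile rank. The function names, the percentile mapping, and the idea of a precomputed "raw activity" number are assumptions made for illustration; the page does not publish its exact formulas, and the recency weighting of commits is not modeled here.

```python
def month_over_month_growth(stars_now: int, stars_month_ago: int) -> float:
    """Return star growth over one month as a percentage."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return 100.0 * (stars_now - stars_month_ago) / stars_month_ago


def activity_score(raw_activity: float, all_raw_activities: list) -> float:
    """Map a raw activity value to a 0-10 scale by percentile rank.

    Assumption: the score is 10 times the fraction of tracked projects
    whose raw activity is strictly lower, so a score of 9.0 places a
    project in the top 10%. The real formula is not documented here.
    """
    if not all_raw_activities:
        raise ValueError("all_raw_activities must not be empty")
    below = sum(1 for value in all_raw_activities if value < raw_activity)
    return 10.0 * below / len(all_raw_activities)
```

With 100 tracked projects whose raw activity values are 0 through 99, a project with a raw value of 90 has 90 projects below it and lands at a score of 9.0.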
catboost
- CatBoost: Open-source gradient boosting library
- Boosting Algorithms
-
What's New with AWS: Amazon SageMaker built-in algorithms now provides four new Tabular Data Modeling Algorithms
CatBoost is another popular and high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT). To learn how to use this algorithm, please see example notebooks for Classification and Regression.
-
Writing the fastest GBDT library in Rust
Here are our benchmarks on training time comparing Tangram's Gradient Boosted Decision Tree Library to LightGBM, XGBoost, CatBoost, and sklearn.
-
Data Science toolset summary from 2021
CatBoost - CatBoost is an open-source software library developed by Yandex. It provides a gradient boosting framework that attempts to handle categorical features using a permutation-driven alternative to the classical algorithm. Link - https://catboost.ai/
-
CatBoost Quickstart — ML Classification
CatBoost is an open source algorithm based on gradient boosted decision trees. It supports numerical, categorical and text features. Check out the docs.
-
[D] What are your favorite Random Forest implementations that support categoricals
If you are considering GBDT, check out CatBoost. Unfortunately, RF mode is not available, but the library implements lots of interesting categorical encoding tricks that boost accuracy.
-
CatBoost and Water Pumps
The data contains a large number of categorical features. The most suitable for obtaining a base-line model, in my opinion, is CatBoost. It is a high-performance, open-source library for gradient boosting on decision trees.
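Several of the mentions above refer to CatBoost's permutation-driven handling of categorical features. The sketch below shows the general idea of ordered target statistics in plain Python: rows are visited in a random order, and each row's category is encoded using only the targets of rows visited before it. This is a simplified illustration, not CatBoost's implementation; the function name and the `prior`, `prior_weight`, and `seed` parameters are hypothetical.

```python
import random
from collections import defaultdict


def ordered_target_encode(categories, targets, prior=0.5, prior_weight=1.0, seed=0):
    """Encode categories with ordered target statistics (simplified sketch).

    Each row is encoded from the targets of rows that precede it in a
    random permutation, so a row's own target never leaks into its encoding.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")

    # Random visiting order (the "permutation" in permutation-driven).
    order = list(range(len(categories)))
    random.Random(seed).shuffle(order)

    sums = defaultdict(float)   # running sum of targets per category
    counts = defaultdict(int)   # running number of rows seen per category
    encoded = [0.0] * len(categories)

    for index in order:
        category = categories[index]
        # Smoothed mean of the targets seen so far for this category.
        encoded[index] = (sums[category] + prior * prior_weight) / (
            counts[category] + prior_weight
        )
        # Only now add the current row to the running statistics.
        sums[category] += targets[index]
        counts[category] += 1

    return encoded
```

The first row seen for any category always receives the prior, because no earlier rows exist to inform its encoding.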
guildai
-
guildai VS cascade - a user suggested alternative
2 projects | 5 Dec 2023
-
[D] Who here are convinced that they have a really good setup that keeps track of their ML experiments?
Experiment tracking in DvC is implemented using git to store snapshots of a project and related artifacts. You might take a look at Guild AI's support for DvC, which is tightly integrated with DvC stages. You can run any of the stages defined for a project and you get a properly isolated run (each run is a project copy to ensure that you're not corrupting the run if you modify files while it's running - as well as properly supporting concurrent runs). Once you have runs in Guild, you can use any number of tools to study, compare, export, etc.
-
[D] Deploying SOTA models into my own projects
I built an experiment tracking tool (Guild AI) that focuses on code/model reuse and so this question is dear to my heart :) Best of luck!
-
[P] I reviewed 50+ open-source MLOps tools. Here’s the result
I'm not aware of experiment tracking in Jupyter notebooks themselves. Guild AI is able to run notebooks as experiments however.
-
[D] What MLOps platform do you use, and how helpful are they?
Disclosure - I'm the author of Guild AI so take this for the biased opinion that it is.
-
[N] Experiment tracking with DvC and Guild AI
I'm the author of Guild AI (open source experiment tracking). For some time now Guild users have asked for DvC support. This is now available as a pre-release.
-
[D] Why doesn’t your team use an experiment tracking tool?
Guild AI now has support for running DvC stages as experiments. DvC uses git under the covers to manage project state for each experiment, along with the experiment results. Guild doesn't touch your git repo and instead copies your project source to a new run directory. This ensures that you have a correct record of your experiment without churning your project state.
-
Data Science toolset summary from 2021
Guild.ai - https://guild.ai/
- [D] How do you ensure reproducibility?
-
[D] I'm new and scrappy. What tips do you have for better logging and documentation when training or hyperparameter training?
Use guild and pytorch-lightning. Make it easy for new contributors to get your data by using dvc as a data access tool.
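Some of the comments above describe run isolation: each experiment runs against its own copy of the project source, so editing files during a run cannot corrupt it and concurrent runs do not interfere. A minimal sketch of that idea using only the Python standard library follows. It is not Guild AI's actual code; the `create_isolated_run` function, the `runs_root` layout, and the ignored patterns are assumptions for illustration.

```python
import shutil
import uuid
from pathlib import Path


def create_isolated_run(project_dir, runs_root):
    """Copy the project source into a fresh, uniquely named run directory.

    Hypothetical helper: the run then executes against the copy, leaving
    the original project (and its git repository) untouched.
    """
    run_dir = Path(runs_root) / uuid.uuid4().hex
    shutil.copytree(
        project_dir,
        run_dir,
        # Assumption: version-control data and bytecode caches are not
        # needed inside the run copy.
        ignore=shutil.ignore_patterns(".git", "__pycache__"),
    )
    return run_dir
```

Because each call produces a new directory name, two runs started at the same time get separate copies, and later edits to the project do not change what an earlier run recorded.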
What are some alternatives?
xgboost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
MLflow - Open source platform for the machine learning lifecycle
Recommender - A C library for product recommendations/suggestions using collaborative filtering (CF)
aim - Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
Keras - Deep Learning for humans
dvc - 🦉 ML Experiments and Data Management with Git
Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
pytorch-lightning - Build high-performance AI models with PyTorch Lightning (organized PyTorch). Deploy models with Lightning Apps (organized Python to build end-to-end ML systems). [Moved to: https://github.com/Lightning-AI/lightning]
vowpal_wabbit - Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
labml - 🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
mxnet - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
wandb - 🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.