catboost vs LightGBM

catboost

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU. (by catboost)

catboost.ai

LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. (by Microsoft)

Gbdt Gbm Machine Learning Data Mining Distributed Lightgbm gbrt Microsoft decision-trees gradient-boosting Python R Parallel Kaggle

lightgbm.readthedocs.io

Project	catboost	LightGBM
Mentions	8	11
Stars	7,776	16,126
Growth	1.1%	1.0%
Activity	9.9	9.1
Latest Commit	5 days ago	2 days ago
Language	Python	C++
License	Apache License 2.0	MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

catboost

Posts with mentions or reviews of catboost. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-05.

CatBoost: Open-source gradient boosting library
1 project | news.ycombinator.com | 5 Mar 2024
Boosting Algorithms
2 projects | dev.to | 5 Jul 2022
What's New with AWS: Amazon SageMaker built-in algorithms now provides four new Tabular Data Modeling Algorithms
3 projects | dev.to | 28 Jun 2022

CatBoost is another popular and high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT). To learn how to use this algorithm, please see example notebooks for Classification and Regression.
Writing the fastest GBDT library in Rust
6 projects | dev.to | 11 Jan 2022

Here are our benchmarks on training time comparing Tangram's Gradient Boosted Decision Tree Library to LightGBM, XGBoost, CatBoost, and sklearn.
Data Science toolset summary from 2021
13 projects | dev.to | 13 Nov 2021

CatBoost is an open-source software library developed by Yandex. It provides a gradient boosting framework that handles categorical features using a permutation-driven alternative to the classical algorithm. Link: https://catboost.ai/
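To give a rough idea of what a permutation-driven treatment of categorical features can look like, here is a minimal pure-Python sketch of ordered target statistics. It is a simplified illustration, not CatBoost's actual implementation: the function name `ordered_target_stats` and the parameters `prior`, `prior_weight`, and `seed` are assumptions made for this example. Each row is encoded from the targets of rows that come earlier in a random permutation, so a row's own label never leaks into its encoding.

```python
import random
from collections import defaultdict


def ordered_target_stats(categories, targets, prior=0.5, prior_weight=1.0, seed=0):
    """Encode each categorical value using only targets seen earlier in a random permutation.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    order = list(range(len(categories)))
    random.Random(seed).shuffle(order)  # the random permutation of row indices

    sums = defaultdict(float)   # running sum of targets per category
    counts = defaultdict(int)   # running number of rows seen per category
    encoded = [0.0] * len(categories)

    for i in order:
        cat = categories[i]
        # Smoothed mean of the targets of earlier rows with the same category.
        # The current row's own target is deliberately not included.
        encoded[i] = (sums[cat] + prior_weight * prior) / (counts[cat] + prior_weight)
        sums[cat] += targets[i]
        counts[cat] += 1
    return encoded


cats = ["a", "b", "a", "a", "b"]
ys = [1, 0, 1, 0, 1]
print(ordered_target_stats(cats, ys))
```

The first row of each category in the permutation has no history, so it falls back to the prior. A plain (non-ordered) target mean would instead use every row's label, including the row being encoded.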
CatBoost Quickstart — ML Classification
2 projects | dev.to | 15 Mar 2021

CatBoost is an open source algorithm based on gradient boosted decision trees. It supports numerical, categorical and text features. Check out the docs.
[D] What are your favorite Random Forest implementations that support categoricals
2 projects | /r/MachineLearning | 20 Feb 2021

If you are considering GBDT, check out catboost. Unfortunately, RF mode is not available, but the library implements lots of interesting categorical encoding tricks that boost accuracy.
CatBoost and Water Pumps
2 projects | dev.to | 20 Feb 2021

The data contains a large number of categorical features. The most suitable for obtaining a baseline model, in my opinion, is CatBoost. It is a high-performance, open-source library for gradient boosting on decision trees.

LightGBM

Posts with mentions or reviews of LightGBM. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-29.

SIRUS.jl: Interpretable Machine Learning via Rule Extraction
2 projects | /r/Julia | 29 Jun 2023

SIRUS.jl is a pure Julia implementation of the SIRUS algorithm by Bénard et al. (2021). The algorithm is a rule-based machine learning model, meaning that it is fully interpretable. The algorithm does this by first fitting a random forest and then converting this forest to rules. Furthermore, the algorithm is stable and achieves a predictive performance that is comparable to LightGBM, a state-of-the-art gradient boosting model created by Microsoft. Interpretability, stability, and predictive performance are described in more detail below.
[D] RAM speeds for tabular machine learning algorithms
1 project | /r/MachineLearning | 9 Jun 2023

Hey, thanks everybody for your answers. I've asked around in the XGBoost and LightGBM repos and some folks there also agreed that memory speed will be a bottleneck yes.
[P] LightGBM but lighter in another language?
1 project | /r/MachineLearning | 4 May 2023

LightGBM seems to have C API support and a C++ example in the main repo.
Use whatever is best for the problem, but still
1 project | /r/datascience | 9 Aug 2022

LGBM doesn't do RF well, but it's easy to manually bag single LGBM trees.
What's New with AWS: Amazon SageMaker built-in algorithms now provides four new Tabular Data Modeling Algorithms
3 projects | dev.to | 28 Jun 2022

LightGBM is a popular and high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT). To learn how to use this algorithm, please see example notebooks for Classification and Regression.
Search YouTube from the terminal written in python
2 projects | /r/Python | 28 Feb 2022

Microsoft lightGBM. https://github.com/microsoft/LightGBM
LightGBM VS CXXGraph - a user suggested alternative
2 projects | 28 Feb 2022
Writing the fastest GBDT library in Rust
6 projects | dev.to | 11 Jan 2022

Here are our benchmarks on training time comparing Tangram's Gradient Boosted Decision Tree Library to LightGBM, XGBoost, CatBoost, and sklearn.
Workstation Management With Nix Flakes: Build a Cmake C++ Package
2 projects | dev.to | 31 Oct 2021

{
  inputs = {
    nixpkgs = { url = "github:nixos/nixpkgs/nixos-unstable"; };
    flake-utils = { url = "github:numtide/flake-utils"; };
  };
  outputs = { nixpkgs, flake-utils, ... }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = import nixpkgs { inherit system; };
        lightgbm-cli = (with pkgs; stdenv.mkDerivation {
          pname = "lightgbm-cli";
          version = "3.3.1";
          src = fetchgit {
            url = "https://github.com/microsoft/LightGBM";
            rev = "v3.3.1";
            sha256 = "pBrsey0RpxxvlwSKrOJEBQp7Hd9Yzr5w5OdUuyFpgF8=";
            fetchSubmodules = true;
          };
          nativeBuildInputs = [ clang cmake ];
          buildPhase = "make -j $NIX_BUILD_CORES";
          installPhase = ''
            mkdir -p $out/bin
            mv $TMP/LightGBM/lightgbm $out/bin
          '';
        }
        );
      in rec {
        defaultApp = flake-utils.lib.mkApp { drv = defaultPackage; };
        defaultPackage = lightgbm-cli;
        devShell = pkgs.mkShell { buildInputs = with pkgs; [ lightgbm-cli ]; };
      }
    );
}
Is it possible to clean memory after using a package that has a memory leak in my python script?
2 projects | /r/Python | 29 Apr 2021

I'm working on the AutoML python package (Github repo). In my package, I'm using many different algorithms. One of the algorithms is LightGBM. After training, the algorithm doesn't release the memory, even if del is called followed by gc.collect(). I created the issue on LightGBM GitHub -> link. Because of this leak, memory consumption is growing very fast during algorithm training.
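One common way to contain a leak like this, when del and gc.collect() do not help, is to run the leaky step in a separate process: the operating system reclaims all of the child's memory when it exits. The sketch below is only an illustration under stated assumptions. The helper `run_isolated` and the `TRAIN_SNIPPET` placeholder are hypothetical names, and the snippet allocates a plain list in place of real LightGBM training. It does not fix the leak itself.

```python
import json
import subprocess
import sys


def run_isolated(snippet, timeout=600):
    """Run a Python snippet in a fresh interpreter and return its JSON-decoded stdout.

    Hypothetical helper: the child process exits after the snippet finishes,
    so any memory it leaked is returned to the operating system.
    """
    completed = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,
        timeout=timeout,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


# Placeholder for a training job (assumption): a real script would train the
# model here and print only the small result that the parent needs, as JSON.
TRAIN_SNIPPET = """
import json
data = [float(i) for i in range(100_000)]  # stands in for training allocations
print(json.dumps({"score": sum(data) / len(data)}))
"""

result = run_isolated(TRAIN_SNIPPET)
print(result["score"])
```

The trade-off is that only serializable results (metrics, file paths to saved models) can cross the process boundary, so the trained model would typically be written to disk by the child and loaded by the parent when needed.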

What are some alternatives?

When comparing catboost and LightGBM you can also consider the following projects:

xgboost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

tensorflow - An Open Source Machine Learning Framework for Everyone

Recommender - A C library for product recommendations/suggestions using collaborative filtering (CF)

H2O - H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Keras - Deep Learning for humans

GPBoost - Combining tree-boosting with Gaussian process and mixed effects models

Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

amazon-sagemaker-examples - Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

vowpal_wabbit - Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

yggdrasil-decision-forests - A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.

mxnet - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

mljar-supervised - Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
