tmu vs catboost

| | tmu | catboost |
|---|---|---|
| Mentions | 5 | 8 |
| Stars | 109 | 7,753 |
| Growth | 2.8% | 0.8% |
| Activity | 9.2 | 9.9 |
| Latest commit | about 1 month ago | 5 days ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tmu
- Tsetlin machine – the other AI toolbox
- Tsetlin Machine Unified (TMU) - One Codebase to Rule Them All
- [R] New Tsetlin machine learning scheme creates up to 80x smaller logical rules, benefiting hardware efficiency and interpretability.
Code: https://github.com/cair/tmu
- This Artificial Intelligence (AI) Research From Norway Introduces Tsetlin Machine-Based Autoencoder For Representing Words Using Logical Expressions
Quick Read: https://www.marktechpost.com/2023/01/10/this-artificial-intelligence-ai-research-from-norway-introduces-tsetlin-machine-based-autoencoder-for-representing-words-using-logical-expressions/
Paper: https://arxiv.org/pdf/2301.00709.pdf
GitHub: https://github.com/cair/tmu
- Do we really need 300 floats to represent the meaning of a word? Representing words with words - a logical approach to word embedding using a self-supervised Tsetlin Machine Autoencoder.
Here is a new self-supervised machine learning approach that captures word meaning with concise logical expressions. The logical expressions consist of contextual words like “black,” “cup,” and “hot” to define other words like “coffee,” thus being human-understandable. I raise the question in the heading because our logical embedding performs competitively on several intrinsic and extrinsic benchmarks, matching pre-trained GloVe embeddings on six downstream classification tasks. Thanks to my clever PhD student Bimal, we now have even more fun and exciting research ahead of us. Our long-term research goal is, of course, to provide an energy-efficient and transparent alternative to deep learning. You can find the paper here: https://arxiv.org/abs/2301.00709, an implementation of the Tsetlin Machine Autoencoder here: https://github.com/cair/tmu, and a simple word embedding demo here: https://github.com/cair/tmu/blob/main/examples/IMDbAutoEncoderDemo.py.
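The learning primitive underneath all of these Tsetlin machine variants is the two-action Tsetlin automaton. Below is a minimal pure-Python sketch of that primitive (illustrative only; this is not the tmu library's API): the automaton has 2N states, where states 1..N select one action and states N+1..2N the other; rewards push the state deeper into the current action's half, and penalties push it toward the other half.

```python
# Sketch of the two-action Tsetlin automaton behind Tsetlin machines.
# Illustrative only; not the tmu library's API.

class TsetlinAutomaton:
    def __init__(self, n_states_per_action=100):
        self.n = n_states_per_action
        self.state = self.n  # start at the boundary, on the action-0 side

    def action(self):
        # States 1..N choose action 0 ("exclude"); N+1..2N choose action 1 ("include").
        return 0 if self.state <= self.n else 1

    def reward(self):
        # Reinforce the current action: move away from the boundary, saturating at the ends.
        if self.action() == 0:
            self.state = max(1, self.state - 1)
        else:
            self.state = min(2 * self.n, self.state + 1)

    def penalize(self):
        # Weaken the current action: move one step toward the other action.
        if self.action() == 0:
            self.state += 1
        else:
            self.state -= 1


ta = TsetlinAutomaton(n_states_per_action=3)
ta.penalize()        # a penalty for action 0 pushes the state across the boundary
print(ta.action())   # -> 1
ta.reward()
ta.reward()          # rewards entrench action 1, saturating at state 2N
print(ta.state)      # -> 6
```

A Tsetlin machine wires thousands of these automata together, one per literal per clause, so that each automaton learns whether its literal should be included in a conjunctive rule.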
catboost
- CatBoost: Open-source gradient boosting library
- Boosting Algorithms
- What's New with AWS: Amazon SageMaker built-in algorithms now provide four new Tabular Data Modeling Algorithms
CatBoost is another popular and high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT). To learn how to use this algorithm, please see example notebooks for Classification and Regression.
- Writing the fastest GBDT library in Rust
Here are our benchmarks on training time comparing Tangram's Gradient Boosted Decision Tree Library to LightGBM, XGBoost, CatBoost, and sklearn.
- Data Science toolset summary from 2021
CatBoost - CatBoost is an open-source software library developed by Yandex. It provides a gradient boosting framework that handles categorical features with a permutation-driven alternative to the classical target-encoding algorithm. Link - https://catboost.ai/
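The "permutation-driven alternative" mentioned above refers to CatBoost's ordered target statistics: each row's categorical value is encoded using only the target values of rows that come before it in a random permutation, so a row's own label never leaks into its encoding. A rough pure-Python sketch of the idea (a simplification, not CatBoost's actual implementation; the prior and smoothing parameters here are illustrative):

```python
import random

def ordered_target_encoding(categories, targets, prior=0.5, smoothing=1.0, seed=0):
    """Sketch of CatBoost-style ordered target statistics (illustrative only):
    each row is encoded with a smoothed mean of the targets of same-category
    rows that precede it in a random permutation, avoiding target leakage."""
    n = len(categories)
    order = list(range(n))
    random.Random(seed).shuffle(order)

    sums = {}    # running sum of targets per category
    counts = {}  # running count of rows per category
    encoded = [0.0] * n
    for i in order:
        c = categories[i]
        s, k = sums.get(c, 0.0), counts.get(c, 0)
        # Smoothed mean over previously seen rows of the same category;
        # a category's first row in the permutation receives only the prior.
        encoded[i] = (s + prior * smoothing) / (k + smoothing)
        sums[c] = s + targets[i]
        counts[c] = k + 1
    return encoded

cats = ["red", "blue", "red", "red", "blue", "blue"]
y    = [1, 0, 1, 0, 0, 1]
print(ordered_target_encoding(cats, y))
```

CatBoost additionally averages over several permutations during training; this single-permutation version only shows why the ordering prevents a row from seeing its own label.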
- CatBoost Quickstart — ML Classification
CatBoost is an open source algorithm based on gradient boosted decision trees. It supports numerical, categorical and text features. Check out the docs.
- [D] What are your favorite Random Forest implementations that support categoricals?
If you're considering GBDT, check out CatBoost; unfortunately, RF mode is not available, but the library implements lots of interesting categorical encoding tricks that boost accuracy.
- CatBoost and Water Pumps
The data contains a large number of categorical features. The most suitable choice for obtaining a baseline model, in my opinion, is CatBoost. It is a high-performance, open-source library for gradient boosting on decision trees.
What are some alternatives?
nvitop - An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
xgboost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
chainer - A flexible framework of neural networks for deep learning
Recommender - A C library for product recommendations/suggestions using collaborative filtering (CF)
scikit-cuda - Python interface to GPU-powered libraries
Keras - Deep Learning for humans
PyCUDA - CUDA integration for Python, plus shiny features
Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
pyopencl - OpenCL integration for Python, plus shiny features
vowpal_wabbit - Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
TsetlinMachine - Code and datasets for the Tsetlin Machine
mxnet - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more