deodel
dgl
deodel
- [P] New predictor does classification intermixed with regression
- Easy Machine Learning Dataset Evaluation Tool (Update)
- What are some practical tips for efficiently handling missing or null values in datasets during data analysis in Python?
  You could use deodel, a new classifier that is very robust. It deals seamlessly with missing data, nulls, mixed numerical and categorical attributes, and multi-class targets. You can see an application with this tool:
- What’s your approach to highly imbalanced data sets?
  Just to mention that there is also a new algorithm that is immune to data imbalance. A Python implementation is available at: https://github.com/c4pub/deodel
- Robust mixed attributes classifier (machine learning)
- [P] We are building a curated list of open source tooling for data-centric AI workflows, looking for contributions.
  The deodel classifier can act as a quick dataset evaluation tool. If your data is available in table format, you can check its potential for prediction or classification by feeding it to deodel. It accepts mixed attributes without any preliminary curation. It simply treats attribute values expressed as floats (dot decimal) as continuous. It even accepts a mix of continuous and categorical values within the same attribute column.
- [D] Open-source package to mix numerical, categorical and text features?
- [P] Discretization: equal-width trumps equal-frequency?
- [P] Discretization: equal-width beats equal-frequency?
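
The rule quoted in the posts above (values written as dot-decimal floats are continuous, everything else is categorical, and both kinds may share a column) can be illustrated with a small standalone sketch. This is not deodel's code or API: `value_kind` and `split_column` are hypothetical helpers, and the handling of integers, empty strings, and NaN is an assumption made only for illustration.

```python
import math
import re

# Assumption: "dot decimal" means text such as "3.14", "-0.5", "3." or ".5".
# Integers and all other values are treated as categorical in this sketch.
_DOT_DECIMAL = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")


def value_kind(value):
    """Label one cell as 'missing', 'continuous', or 'categorical'."""
    if value is None or value == "":
        return "missing"
    if isinstance(value, float):
        # Assumption: NaN is counted as missing rather than continuous.
        return "missing" if math.isnan(value) else "continuous"
    if isinstance(value, str) and _DOT_DECIMAL.fullmatch(value.strip()):
        return "continuous"
    return "categorical"


def split_column(column):
    """Group the cells of one mixed column by kind, converting floats."""
    kinds = {"missing": [], "continuous": [], "categorical": []}
    for cell in column:
        kind = value_kind(cell)
        kinds[kind].append(float(cell) if kind == "continuous" else cell)
    return kinds


if __name__ == "__main__":
    # One attribute column mixing floats, float-like text, labels, and a gap.
    print(split_column([1.5, "2.25", "red", 3, None, "7"]))
```

A classifier following this rule would then be free to handle each group differently, for example by discretizing the continuous values while matching the categorical ones exactly.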
dgl
- [P] We are building a curated list of open source tooling for data-centric AI workflows, looking for contributions.
  For graph embeddings, there are quite a few. I'd recommend this one, but there's also this one (disclaimer: I'm the author) or this one, more of a DGL library.
- Detecting Out-of-Distribution Datapoints via Embeddings or Predictions
  For trees/graphs, you’ll want a neural net that can take these as inputs, for which I’m not sure a standard library exists. One recommendation is to check out dgl: https://github.com/dmlc/dgl
- Beyond Message Passing: A Physics-Inspired Paradigm for Graph Neural Networks
- [D] Convenient libs to use for new research project at the intersection of GNN and RL.
  The best pkg for GCN - https://github.com/dmlc/dgl
What are some alternatives?
BotLibre - An open platform for artificial intelligence, chat bots, virtual agents, social media automation, and live chat automation.
pytorch_geometric - Graph Neural Network Library for PyTorch
grape - 🍇 GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations
pytorch_geometric_temporal - PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models (CIKM 2021)
ydata-synthetic - Synthetic data generators for tabular and time-series data
torchdrug - A powerful and flexible machine learning platform for drug discovery
misc
spektral - Graph Neural Networks with Keras and Tensorflow 2.
general_class_balancer - Data matching algorithm for categorical and continuous variables
deep_gcns_torch - Pytorch Repo for DeepGCNs (ICCV'2019 Oral, TPAMI'2021), DeeperGCN (arXiv'2020) and GNN1000 (ICML'2021): https://www.deepgcns.org
cleanlab - The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
SuperGluePretrainedNetwork - SuperGlue: Learning Feature Matching with Graph Neural Networks (CVPR 2020, Oral)