kmodes
MinSizeKmeans
Metric | kmodes | MinSizeKmeans |
---|---|---|
Mentions | 2 | 1 |
Stars | 1,218 | 80 |
Growth | - | - |
Activity | 4.9 | 0.0 |
Last commit | 3 months ago | about 3 years ago |
Language | Python | Python |
License | MIT License | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kmodes
- kmodes, Python package for categorical clustering releases version 0.12.0. Now with sample weighting and Python 3.10 support.
- How much of data science is lying?
  They were probably looking for K-modes
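For readers who have not met k-modes before, the idea can be sketched in a few lines: it follows the k-means loop but compares categorical rows by counting mismatched attributes and updates each cluster to its per-column mode. The sketch below is a minimal standard-library illustration, not the kmodes package's implementation or API; the function names, the toy data, and the way weights enter the mode count are all assumptions made for the example.

```python
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of positions where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows, weights):
    """Weighted mode of each column; ties go to the first value seen."""
    modes = []
    for col in zip(*rows):
        counts = Counter()
        for value, w in zip(col, weights):
            counts[value] += w
        modes.append(max(counts, key=counts.get))
    return tuple(modes)


def k_modes(rows, init_modes, weights=None, max_iter=20):
    """Alternate nearest-mode assignment and mode updates until labels stop changing."""
    weights = weights or [1.0] * len(rows)
    modes = [tuple(m) for m in init_modes]
    labels = None
    for _ in range(max_iter):
        # Assign each row to the closest mode (ties go to the lowest index).
        new_labels = [
            min(range(len(modes)), key=lambda j: hamming(row, modes[j]))
            for row in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(modes)):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if members:  # keep the old mode if a cluster ends up empty
                modes[j] = column_modes(
                    [rows[i] for i in members], [weights[i] for i in members]
                )
    return labels, modes


# Toy categorical data, made up for illustration.
rows = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("green", "large", "square"),
]
labels, modes = k_modes(rows, init_modes=[rows[0], rows[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(modes)   # [('red', 'small', 'round'), ('blue', 'large', 'square')]
```

Passing `weights` lets heavier rows count for more when the modes are recomputed, which is one plausible reading of "sample weighting"; the package itself may define it differently.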
MinSizeKmeans
- Preliminary Evidence that Retail Trades can be Identified and Counted on the Tape
  Sure, so basically I scraped the volume from the SEC report figure depicting 'Buy' volume by measuring pixels between ticks on the y-axis. I also downloaded all regular-session trades for the dates in that figure. After some rounds of data cleaning (i.e. one-hot encoding the trade condition data), I ran the trades from the first half through an implementation of k-means clustering with minimum cluster size constraints, set to cluster the trades, weighted by volume, into 2 groups with the minimum weight set to the volume scraped from the candle in the SEC report. This clustering takes a long time to run, so I've only managed to process that first bar.
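Two of the preparation steps described in that post, one-hot encoding a categorical field and weighting points by volume, can be illustrated with a small sketch. It is not the poster's code and not the MinSizeKmeans implementation, and it does not enforce any minimum cluster size; the condition codes, volumes, and function names are hypothetical.

```python
def one_hot(values):
    """Map each categorical value to a 0/1 vector over the sorted distinct values."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    vectors = []
    for v in values:
        vec = [0.0] * len(categories)
        vec[index[v]] = 1.0
        vectors.append(vec)
    return categories, vectors


def weighted_centroid(vectors, weights):
    """Weighted mean of the vectors, as used in a weighted k-means centroid update."""
    total = sum(weights)
    dims = len(vectors[0])
    return [
        sum(w * vec[d] for vec, w in zip(vectors, weights)) / total
        for d in range(dims)
    ]


# Hypothetical trade-condition codes and volumes, made up for illustration.
conditions = ["regular", "odd_lot", "regular", "intermarket_sweep"]
volumes = [100, 50, 300, 50]

categories, vectors = one_hot(conditions)
centroid = weighted_centroid(vectors, volumes)
print(categories)  # ['intermarket_sweep', 'odd_lot', 'regular']
print(centroid)
```

Each centroid coordinate is then the share of total volume carried by trades with that condition, so high-volume trades pull the cluster centre harder than small ones.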
What are some alternatives?
yellowbrick - Visual analysis and diagnostic tools to facilitate machine learning model selection.
Clustering4Ever - C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
best-of-ml-python - 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
MAGIST-Algorithm - Multi-Agent Generally Intelligent Simultaneous Training Algorithm for Project Zeta
data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
PSOClustering - An implementation of clustering the Iris dataset with particle swarm optimization (PSO)
Dask - Parallel computing with task scheduling
orange - 🍊 📊 💡 Orange: Interactive data analysis
leidenalg - Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
fuzzy-c-means - A simple python implementation of Fuzzy C-means algorithm.