Introduction to K-Means Clustering

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • hdbscan

    A high performance implementation of HDBSCAN clustering.

  • Working in spatial data science, I rarely find applications where k-means is the best tool. The problem is that it is difficult to know in advance how many clusters to expect on a map. Is it 5, 500, or 10,000? Here HDBSCAN [1] shines because it both clusters the data _and_ selects the most suitable number of clusters by choosing where to cut the single-linkage cluster tree.

    [1]: https://github.com/scikit-learn-contrib/hdbscan

  • ckwrap

    Wrapper for Ckmeans.1d.dp.

  • Note also that, specifically for one-dimensional data, the k-means clustering problem has a globally optimal solution. There is an R package that implements it with a C++ core [1], and also a Python wrapper [2].

    [1]: https://cran.r-project.org/package=Ckmeans.1d.dp

    [2]: https://github.com/djdt/ckwrap
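
To see why one-dimensional k-means can be solved exactly, here is a minimal dynamic-programming sketch. It relies on the fact that, for sorted 1-D data, optimal clusters are contiguous runs. This is an illustrative O(k·n²) version and not the code of Ckmeans.1d.dp or ckwrap; the function name `kmeans_1d_optimal` is hypothetical.

```python
def kmeans_1d_optimal(values, k):
    """Split 1-D values into k clusters with minimal within-cluster
    sum of squares. Returns (clusters, total_cost)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums give the cost of any contiguous run in O(1).
    ps = [0.0] * (n + 1)
    ps2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        ps[i + 1] = ps[i] + x
        ps2[i + 1] = ps2[i] + x * x

    def cost(i, j):
        # Sum of squared deviations of xs[i:j] from its own mean.
        s = ps[j] - ps[i]
        return (ps2[j] - ps2[i]) - s * s / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting xs[:j] into m clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    cut[m][j] = i

    # Walk the stored cut points backwards to recover the clusters.
    clusters = []
    j = n
    for m in range(k, 0, -1):
        i = cut[m][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return clusters, best[k][n]


clusters, total = kmeans_1d_optimal([1.0, 1.1, 0.9, 5.0, 5.2, 9.8, 10.0, 10.1], 3)
print(clusters)
```

Because every possible split of the sorted values is considered, the result does not depend on a random initialization, which is the practical difference from the usual iterative k-means.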

  • groupImg

    A script in Python to organize your images by similarity.

  • If anyone is interested, I have two projects that use k-means:

    https://github.com/victorqribeiro/groupImg

    https://github.com/victorqribeiro/budget

    Since it was one of the first ML algorithms I learned, I spent some time finding use cases for it.

    If I'm not mistaken, I've also used it to classify deforestation in an exercise.

  • budget

    A simple budget app that predicts where the expenses are being made (by victorqribeiro)


  • word2vec

    Automatically exported from code.google.com/p/word2vec

  • It is not necessarily the case.

    For example, word2vec uses k-means clustering with a cosine similarity measure [1]. It works very, very well. The caveat is that not many of the optimized variations of k-means will work with that "distance".

    [1] https://github.com/tmikolov/word2vec/blob/master/word2vec.c#...
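
As a rough illustration of k-means under cosine similarity (often called spherical k-means), the sketch below normalizes every vector, assigns each one to the centroid with the largest dot product, and renormalizes centroids after each update. It is not a transcription of word2vec.c; the function name `cosine_kmeans`, the iteration count, and the toy vectors are assumptions made for the example.

```python
import math
import random


def normalize(v):
    """Scale v to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_kmeans(vectors, k, iterations=10, seed=0):
    rng = random.Random(seed)
    unit = [normalize(v) for v in vectors]
    # Start from k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(unit, k)]
    labels = [0] * len(unit)
    for _ in range(iterations):
        # Assignment: for unit vectors, the largest dot product is the
        # largest cosine similarity.
        labels = [max(range(k), key=lambda c: dot(v, centroids[c])) for v in unit]
        # Update: sum the members, then project back onto the unit sphere.
        for c in range(k):
            members = [v for v, lab in zip(unit, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                total = [sum(col) for col in zip(*members)]
                centroids[c] = normalize(total)
    return labels, centroids


# Toy vectors: same direction matters, magnitude does not.
vectors = [[1, 0.1], [10, 0.5], [3, 0.2], [0.1, 1], [0.4, 8], [0.3, 2]]
labels, centroids = cosine_kmeans(vectors, k=2)
print(labels)
```

The first three toy vectors point roughly along the x-axis and the last three along the y-axis, so they group by direction even though their lengths differ by an order of magnitude.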
