
Working in spatial data science, I rarely find applications where k-means is the best tool. The problem is that it is difficult to know how many clusters to expect on maps. Is it 5, 500, or 10,000? Here HDBSCAN [1] shines, because it will cluster _and_ select the most suitable number of clusters by cutting the single-linkage cluster tree.
[1]: https://github.com/scikit-learn-contrib/hdbscan
Note also that, specifically for one-dimensional data, there is a globally optimal solution to the k-means clustering problem. There is an R package that implements it with a C++ core [1], and also a Python wrapper [2].
[1]: https://cran.r-project.org/package=Ckmeans.1d.dp
[2]: https://github.com/djdt/ckwrap
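The dynamic-programming idea behind Ckmeans.1d.dp can be sketched in a few lines. This is an illustrative O(k·n²) version, not the package's faster implementation: sort the points, then D[c][i] is the optimal cost of splitting the first i+1 points into c contiguous clusters.

```python
import numpy as np

def kmeans_1d_optimal(x, k):
    """Globally optimal 1-D k-means via dynamic programming (O(k*n^2) sketch)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    # Prefix sums give O(1) within-segment sum-of-squared-errors.
    s = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def sse(i, j):  # SSE of one cluster covering x[i..j] inclusive
        m = j - i + 1
        seg = s[j + 1] - s[i]
        return (s2[j + 1] - s2[i]) - seg * seg / m

    D = np.full((k + 1, n), np.inf)     # D[c, i]: best cost, c clusters over x[0..i]
    back = np.zeros((k + 1, n), dtype=int)
    D[1, :] = [sse(0, i) for i in range(n)]
    for c in range(2, k + 1):
        for i in range(c - 1, n):
            for j in range(c - 1, i + 1):   # j = start of the last cluster
                cand = D[c - 1, j - 1] + sse(j, i)
                if cand < D[c, i]:
                    D[c, i] = cand
                    back[c, i] = j
    # Backtrack the cluster start indices.
    starts, i = [0] * k, n - 1
    for c in range(k, 1, -1):
        starts[c - 1] = back[c, i]
        i = starts[c - 1] - 1
    clusters = [x[starts[c]:(starts[c + 1] if c + 1 < k else n)]
                for c in range(k)]
    return D[k, n - 1], clusters
```

Because an optimal 1-D clustering always uses contiguous runs of the sorted data, the DP over split points is exact, with no dependence on initialization the way Lloyd's algorithm has.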
If anyone is interested, I have two projects that use k-means:
https://github.com/victorqribeiro/groupImg
https://github.com/victorqribeiro/budget
Being one of the first ML algorithms I learned, I spent some time finding use cases for it.
If I'm not mistaken, I've also used it to classify deforestation in an exercise.
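One common way a tool like groupImg can group images with k-means is to reduce each image to a simple color feature and cluster those. This is my own sketch of that idea (the feature choice and scikit-learn usage are assumptions, not the project's actual code):

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in "images": small HxWx3 arrays. In practice these would be
# loaded from files; here they are synthetic reddish/bluish images.
images = [
    np.full((4, 4, 3), [250, 10, 10]),   # reddish
    np.full((4, 4, 3), [240, 20, 5]),    # reddish
    np.full((4, 4, 3), [10, 10, 250]),   # bluish
    np.full((4, 4, 3), [5, 30, 240]),    # bluish
]
# Feature per image: its mean RGB color (a 3-vector).
features = np.array([img.reshape(-1, 3).mean(axis=0) for img in images])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
```

With features this simple, k-means works well precisely because you choose k yourself ("sort my photos into 2 piles"), which sidesteps the cluster-count problem raised above.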
That is not necessarily the case.
For example, word2vec does k-means clustering with a cosine similarity measure [1]. It works very, very well. The caveat is that not many optimized variants of k-means will work with that "distance".
[1] https://github.com/tmikolov/word2vec/blob/master/word2vec.c#...
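The cosine variant is often called spherical k-means. A hedged sketch of the usual scheme (normalize the data, assign by dot product, renormalize centroids each step); this is illustrative, not word2vec's actual C code:

```python
import numpy as np

def spherical_kmeans(X, k, iters=50, seed=0):
    """k-means under cosine similarity: all points and centroids
    live on the unit sphere, assignment is by largest dot product."""
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-length rows
    C = X[rng.choice(len(X), size=k, replace=False)]   # init from data points
    for _ in range(iters):
        labels = (X @ C.T).argmax(axis=1)              # nearest by cosine
        for j in range(k):
            members = X[labels == j]
            if len(members):
                m = members.sum(axis=0)
                C[j] = m / np.linalg.norm(m)           # project mean back to sphere
    return labels, C
```

Renormalizing the centroid after averaging is what keeps the dot product equivalent to cosine similarity; that projection step is also why many accelerated k-means variants (which assume Euclidean triangle-inequality bounds) don't transfer directly.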