Hdbscan Alternatives
Similar projects and alternatives to hdbscan
-
Milvus
A cloud-native vector database, storage for next generation AI applications
-
homemade-machine-learning
🤖 Python examples of popular machine learning algorithms with interactive Jupyter demos and math being explained
-
100DaysofMLCode
My journey to learn and grow in the domain of Machine Learning and Artificial Intelligence by performing the #100DaysofMLCode Challenge. Now supported by bright developers adding their learnings 👍
-
PyImpetus
PyImpetus is a Markov Blanket based feature subset selection algorithm that considers features both separately and together as a group in order to provide not just the best set of features but also the best combination of features
-
sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
-
budget
A simple budget app that predicts where the expenses are being made (by victorqribeiro)
-
leidenalg
Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
hdbscan reviews and mentions
-
Introducing the Semantic Graph
A number of excellent topic modeling libraries exist in Python today. BERTopic and Top2Vec are two of the most popular. Both use sentence-transformers to encode data into vectors, UMAP for dimensionality reduction and HDBSCAN to cluster nodes.
-
Introduction to K-Means Clustering
Working in spatial data science, I rarely find applications where k-means is the best tool. The problem is that it is difficult to know how many clusters to expect on a map. Is it 5, 500, or 10,000? Here HDBSCAN [1] shines because it both clusters the data _and_ selects the most suitable number of clusters by choosing where to cut the single-linkage cluster tree.
-
[D] Good algorithm for clustering big data (sentences represented as embeddings)?
Maybe use (H)DBSCAN, which I think should also work for huge datasets. I don't think there is a ready-to-use clustering implementation with a built-in cosine similarity metric, and you also won't be able to precompute the 100k x 100k dense similarity matrix. The only way to go is to L2-normalize your embeddings; the dot product then equals the cosine similarity, and the Euclidean distance between the normalized vectors serves as a proxy for cosine distance. See also https://github.com/scikit-learn-contrib/hdbscan/issues/69
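The normalization suggestion in the comment above rests on a simple identity: for unit-length vectors u and v, ||u - v||^2 = 2 - 2 * cos(u, v), so ranking pairs by Euclidean distance is equivalent to ranking them by cosine similarity. The following is a minimal sketch in plain Python that checks this on two made-up toy vectors. The helper names and the vectors are illustrative assumptions, not part of hdbscan or any library mentioned here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity computed directly from the raw vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def squared_euclidean(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical toy "embeddings", chosen only for illustration.
a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

ua, ub = l2_normalize(a), l2_normalize(b)
cos_ab = cosine_similarity(a, b)

# After normalization the dot product is the cosine similarity ...
dot_unit = dot(ua, ub)
# ... and the squared Euclidean distance is a decreasing function of it.
sq_dist_unit = squared_euclidean(ua, ub)

print(math.isclose(dot_unit, cos_ab))
print(math.isclose(sq_dist_unit, 2.0 - 2.0 * cos_ab))
```

Because the relationship is monotonic, a clustering implementation that only supports Euclidean distance can be run on the normalized embeddings without building the dense pairwise similarity matrix.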
Stats
scikit-learn-contrib/hdbscan is an open source project licensed under BSD 3-clause "New" or "Revised" License which is an OSI approved license.
The primary programming language of hdbscan is Jupyter Notebook.