[D] Is it better to create a different set of Doc2Vec embeddings for each group in my dataset, rather than generating embeddings for the entire dataset?

Top2Vec

13 2,839 7.0 Python

Top2Vec learns jointly embedded topic, document and word vectors.

I'm using Top2Vec with Doc2Vec embeddings to find topics in a dataset of ~4000 social media posts. This dataset has three groups:
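The group names are not given in the excerpt above, so the following is only a minimal sketch of the two options being compared: one Top2Vec model trained on every post, versus one model per group. The helper names (`split_by_group`, `fit_top2vec`, `fit_joint_and_per_group`) and the `posts`/`groups` inputs are hypothetical and not from the original post; the only library call assumed is the `Top2Vec` constructor with `embedding_model="doc2vec"`.

```python
from collections import defaultdict


def split_by_group(posts, groups):
    """Bucket posts by group label, preserving the original order within each group."""
    if len(posts) != len(groups):
        raise ValueError("posts and groups must have the same length")
    buckets = defaultdict(list)
    for post, group in zip(posts, groups):
        buckets[group].append(post)
    return dict(buckets)


def fit_top2vec(documents, **kwargs):
    """Train one Top2Vec model with Doc2Vec embeddings on `documents`."""
    # Imported lazily so the grouping helper works without the dependency.
    # Requires `pip install top2vec`.
    from top2vec import Top2Vec

    return Top2Vec(documents, embedding_model="doc2vec", **kwargs)


def fit_joint_and_per_group(posts, groups, **kwargs):
    """Hypothetical comparison helper: returns (joint_model, {group: model})."""
    joint = fit_top2vec(posts, **kwargs)
    per_group = {
        group: fit_top2vec(docs, **kwargs)
        for group, docs in split_by_group(posts, groups).items()
    }
    return joint, per_group
```

One design point worth keeping in mind: separately trained Doc2Vec models produce vectors in unrelated spaces, so topic vectors from one group's model cannot be compared directly with another's. A single joint model keeps all posts in one space, and topic membership can then be tallied per group afterwards. With roughly 4000 posts split three ways, each per-group model would also see considerably less training data.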

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Tips for best Top2Vec (HDBSCAN) usage
1 project | /r/datascience | 8 Jun 2023
Top2Vec: Embed topics, documents and word vectors
1 project | news.ycombinator.com | 13 May 2022
How to cluster articles about software vulnerabilities?
1 project | /r/MLQuestions | 8 Apr 2022
Data Science - Text Classification
1 project | /r/brdev | 17 Feb 2022
Extracting topics from 250k facebook posts
1 project | /r/LanguageTechnology | 26 May 2021

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning
topic-modeling word-embeddings document-embedding topic-vector topic-search
Post date: 28 Oct 2023
