Latest trends in topic modelling?

This page summarizes the projects mentioned and recommended in the original post on /r/LanguageTechnology

InfluxDB - Power Real-Time Data Analytics at Scale
Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
www.influxdata.com
featured
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com
featured
  • contextualized-topic-models

    A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.

  • Cross-lingual Contextualized Topic Models with Zero-shot Learning from a team at MilaNLP which uses bag of words representations in combination with multi lingual embeddings from SBERT and works like a VAE (encode the input, use the encoded representation to decode back to a bag of words as close to the input as possible). Using SBERT embeddings makes their model generalise for other languages which may be useful. One major shortfall of this model as I understand is that it can't deal with long documents very elegantly - only up to BERT'S word limit (the workaround is to truncate and use the first words)

  • OCTIS

    OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)

  • Silvia Terragni (a coauthor on the above) also brought a topic modelling library OCTIS which was exhibited as a demo paper and aims to be the huggingface transformers of topic modelling - it includes wrappers around the above model as well as and LDA and some baselines as well as some tools and frameworks for comparing them.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • BERTopic

    Leveraging BERT and c-TF-IDF to create easily interpretable topics.

  • BERT actually does pretty decent!

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts