(NLP) Best practices for topic modeling and generating interesting topics?

This page summarizes the projects mentioned and recommended in the original post on /r/MLQuestions

  1. contextualized-topic-models

    A Python package for contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to produce coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

    If you use CTM, you can give the topic model two inputs: the preprocessed texts (which the topic model uses to generate the topical words) and the unpreprocessed texts (which are used to generate the contextualized representations that are later concatenated to the document bag-of-words representation). We found that this slightly improves performance compared with giving BERT the already-preprocessed text. This feature is supported in the original implementation of CTM, but not in OCTIS. See here: https://github.com/MilaNLProc/contextualized-topic-models#combined-topic-model

  2. OCTIS

    OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track).

    My team and I recently released a Python library called OCTIS (https://github.com/mind-Lab/octis) that automatically optimizes the hyperparameters of a topic model according to a chosen evaluation metric (not log-likelihood). In your case, you are probably interested in topic coherence, so you can get good-quality topics with little effort spent on choosing hyperparameters. We also included some state-of-the-art topic models, e.g. contextualized topic models (https://github.com/MilaNLProc/contextualized-topic-models).
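
To make the idea of choosing hyperparameters by a coherence metric concrete, here is a minimal sketch in plain Python. It does not use the OCTIS API and does not reproduce its optimization strategy; it only scores hypothetical topic outputs with an NPMI-style coherence computed from document co-occurrence and keeps the best-scoring configuration. The names `config_a` and `config_b`, the toy documents, and the handling of edge cases are all assumptions made for illustration.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Average NPMI over all word pairs of one topic (document-level co-occurrence)."""
    doc_sets = [set(doc.split()) for doc in documents]
    n_docs = len(doc_sets)
    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1 = sum(w1 in d for d in doc_sets) / n_docs
        p2 = sum(w2 in d for d in doc_sets) / n_docs
        p12 = sum(w1 in d and w2 in d for d in doc_sets) / n_docs
        if p12 == 0:
            # The pair never co-occurs: use the minimum NPMI value.
            scores.append(-1.0)
        elif p12 == 1.0:
            # Both words appear in every document, so NPMI is undefined;
            # treating it as 0.0 is an assumption of this sketch.
            scores.append(0.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


def mean_coherence(topics, documents):
    """Average coherence across all topics produced by one configuration."""
    return sum(npmi_coherence(t, documents) for t in topics) / len(topics)


def select_best(candidates, documents):
    """Return the name of the configuration whose topics score highest."""
    return max(candidates, key=lambda name: mean_coherence(candidates[name], documents))


# Toy corpus (already preprocessed, whitespace-tokenized).
documents = [
    "cat dog pet",
    "dog pet food",
    "cat pet food",
    "stock market trade",
    "market trade price",
    "stock price trade",
]

# Hypothetical top words produced by two hyperparameter settings.
candidates = {
    "config_a": [["cat", "dog", "pet"], ["stock", "market", "trade"]],
    "config_b": [["cat", "market", "pet"], ["stock", "dog", "trade"]],
}

best = select_best(candidates, documents)
print(best, round(mean_coherence(candidates[best], documents), 3))
```

In practice the candidate topics would come from training a model once per hyperparameter setting, and a library such as OCTIS searches that space for you instead of comparing a fixed list.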

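The two-input setup described for contextualized-topic-models (preprocessed text for the bag of words, unpreprocessed text for the contextualized representation) can be sketched roughly as follows. This is not the library's API: `placeholder_embedding` is a hypothetical stand-in that hashes the raw text so the example runs without any model download, and the stopword list and toy documents are likewise assumptions. For the real data-preparation calls, see the README linked above.

```python
import hashlib
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the library's list).
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}


def preprocess(text):
    """Toy preprocessing: lowercase, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def bow_vector(tokens, vocab):
    """Bag-of-words counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [float(counts[w]) for w in vocab]


def placeholder_embedding(raw_text, dim=4):
    """Hypothetical stand-in for a contextual encoder such as BERT.

    It only derives a deterministic vector from a hash of the raw text,
    so it carries no semantic information.
    """
    digest = hashlib.sha256(raw_text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


raw_docs = [
    "The cat sat in the garden, watching birds.",
    "Stock prices fell sharply in the market today!",
]

# Input 1: preprocessed text, used for the vocabulary and the bag of words.
preprocessed_docs = [preprocess(d) for d in raw_docs]
vocab = sorted({t for doc in preprocessed_docs for t in doc})

# Input 2: the unpreprocessed text goes to the encoder. Both inputs stay
# aligned document by document, and the two vectors are concatenated.
model_inputs = [
    bow_vector(tokens, vocab) + placeholder_embedding(raw)
    for raw, tokens in zip(raw_docs, preprocessed_docs)
]

print(len(vocab), len(model_inputs[0]))
```

The point of the sketch is the pairing: the encoder sees the original sentence, punctuation and stopwords included, while the topical words are drawn only from the cleaned vocabulary.
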
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Interpretation of topic modeling results between LDA and BERTopic

    1 project | /r/LanguageTechnology | 18 Sep 2022
  • I am working on a topic modelling paper and I need your help

    1 project | /r/LanguageTechnology | 6 May 2021
  • Latest trends in topic modelling?

    3 projects | /r/LanguageTechnology | 24 Apr 2021
  • OCTIS a python framework to compare and optimize Topic Models

    1 project | /r/learnmachinelearning | 20 Apr 2021
  • OCTIS, our new python framework to optimize and compare topic models has been accepted at EACL2021!

    1 project | /r/learnmachinelearning | 19 Apr 2021
