Python text-as-data

Open-source Python projects categorized as text-as-data

Top 3 Python text-as-data Projects

  1. scattertext

    Beautiful visualizations of how language differs among document types.

  2. contextualized-topic-models

    A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

  3. shifterator

    Interpretable data visualizations for understanding how texts differ at the word level

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python text-as-data related posts

  • Extract words from large data set of reviews by sentiment

    1 project | /r/MLQuestions | 23 May 2022
  • Catogorize the Data- Topic Modelling algorithm

    1 project | /r/LanguageTechnology | 1 Oct 2021

Index

What are some of the best open-source text-as-data projects in Python? This list will help you:

#  Project                      Stars
1  scattertext                  2,311
2  contextualized-topic-models  1,242
3  shifterator                    280
