PySpark for NLP Workshop - Materials and Jupyter Notebooks

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering.

  • spark-nlp

    State of the Art Natural Language Processing

    I recently had the opportunity to run a workshop at ODSC East on using PySpark for Natural Language Processing (NLP). I had a great time explaining PySpark's fundamentals and exploring the Spark NLP library.

  • pyspark_nlp_workshop

    Instructions and code for the workshop "From Big Data to NLP Insights: Unlocking the Power of PySpark and Spark NLP"

    For anyone interested, I've made the Jupyter notebooks from the workshop available, complete with instructions to get them up and running on Databricks. You can find them here:
