Making the Spark DataFrame composition type safe(r)

This page summarizes the projects mentioned and recommended in the original post on /r/apachespark.

  • frameless

    Expressive types for Spark.

  • Valid point! Have you seen the withColumnTupled API? It returns a typed tuple instead, which seems to satisfy your use case: the dataset keeps its type and no new case class is required. That is roughly what you're suggesting, but without case class generation. I'm not sure whether attribute labels (names) are preserved in this case, though, and it's unclear whether this is good enough for wide tables.

  • bebe

    Filling in the Spark function gaps across APIs

  • See here for a more detailed discussion and let me know your thoughts!
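
For readers who have not used the withColumnTupled API mentioned in the frameless comment above, here is a minimal sketch of what that call looks like. It assumes frameless and Spark are on the classpath with Scala 2.12 or 2.13. The Person case class, the sample rows, and the object name are hypothetical and exist only for illustration.

```scala
import frameless.TypedDataset
import org.apache.spark.sql.SparkSession

// Hypothetical schema, used only for illustration.
final case class Person(name: String, age: Int)

object WithColumnTupledSketch {
  def main(args: Array[String]): Unit = {
    // TypedDataset.create needs an implicit SparkSession in scope.
    implicit val spark: SparkSession = SparkSession
      .builder()
      .master("local[*]")
      .appName("with-column-tupled-sketch")
      .getOrCreate()

    val people: TypedDataset[Person] =
      TypedDataset.create(Seq(Person("Ann", 30), Person("Bob", 17)))

    // Appending a column yields a tuple type rather than a new case class:
    // Person's fields (String, Int) followed by the new Int column.
    val withDoubledAge: TypedDataset[(String, Int, Int)] =
      people.withColumnTupled(people('age) * 2)

    // Drop down to the underlying Spark Dataset to inspect the result.
    withDoubledAge.dataset.show()

    spark.stop()
  }
}
```

The explicit type annotation on withDoubledAge is the point of the sketch: the compiler checks it, so the result stays typed without a hand-written case class. Because the result is a plain tuple, the original field names are not part of its type, and Scala 2 tuples stop at 22 elements, which bears on the wide-table concern raised in the comment.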
