Testing Spark applications

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering.

  • chispa

    PySpark test helper methods with beautiful error messages

  • We write unit and e2e tests using a combination of pytest and chispa (https://github.com/MrPowers/chispa). We also use a custom library that generates random test data fitting a given schema, with optional hardcoded overrides for the fields relevant to the business logic under test.
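
The custom test-data library mentioned in the comment is not public, so the following is only a minimal sketch of the idea, under stated assumptions: the schema is a plain dict mapping field names to type names, and `GENERATORS` and `random_rows` are hypothetical names invented for this illustration. Random values fill every field, and hardcoded overrides pin the fields that the business logic under test depends on.

```python
import random
import string
from datetime import date, timedelta

# Hypothetical type registry: maps a type name to a function that draws
# one random value from the given random.Random instance.
GENERATORS = {
    "int": lambda rng: rng.randint(0, 1_000_000),
    "float": lambda rng: rng.uniform(0.0, 1_000.0),
    "str": lambda rng: "".join(rng.choices(string.ascii_lowercase, k=8)),
    "bool": lambda rng: rng.random() < 0.5,
    "date": lambda rng: date(2020, 1, 1) + timedelta(days=rng.randint(0, 365)),
}


def random_rows(schema, n, overrides=None, seed=None):
    """Return n dict rows matching schema; overrides pin chosen fields."""
    rng = random.Random(seed)  # seeding keeps test data reproducible
    overrides = overrides or {}
    unknown = set(overrides) - set(schema)
    if unknown:
        # Fail loudly if a test pins a field the schema does not have.
        raise KeyError(f"overrides not in schema: {sorted(unknown)}")
    rows = []
    for _ in range(n):
        row = {name: GENERATORS[type_name](rng) for name, type_name in schema.items()}
        row.update(overrides)  # hardcoded values replace the random ones
        rows.append(row)
    return rows


# Assumed example schema: only the fields that matter to the test are pinned.
schema = {"order_id": "int", "customer": "str", "amount": "float", "is_refund": "bool"}
rows = random_rows(schema, n=3, overrides={"is_refund": True, "amount": 10.0}, seed=42)
```

In a PySpark test suite, rows like these could be turned into a DataFrame inside a pytest test and the transformation output compared against an expected DataFrame with chispa; that step is left out here so the sketch runs without a Spark session.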

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Spark open source community is awesome

    5 projects | /r/apachespark | 29 Dec 2022
  • Invitation to collaborate on open source PySpark projects

    3 projects | /r/apachespark | 15 Oct 2022
  • installing pyspark on my m1 mac, getting an env error

    2 projects | /r/apachespark | 4 Jun 2022
  • Spark: local dev environment

    2 projects | /r/dataengineering | 7 Feb 2022
  • Pyspark now provides a native Pandas API

    3 projects | /r/Python | 2 Jan 2022