Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) (by MrPowers)

Spark-fast-tests Alternatives

Similar projects and alternatives to spark-fast-tests

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better spark-fast-tests alternative or higher similarity.

spark-fast-tests reviews and mentions

Posts with mentions or reviews of spark-fast-tests. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-13.
  • Lakehouse architecture in Azure Synapse without Databricks?
    2 projects | /r/dataengineering | 13 Apr 2023
    I was a Databricks user for 5 years and spent 95% of my time developing Spark code in IDEs. See the spark-daria and spark-fast-tests projects as Scala examples. I developed internal libraries with all the business logic. The Databricks notebooks would consist of a few lines of code that would invoke a function in the proprietary Spark codebase. The proprietary Spark codebase would depend on the OSS libraries I developed in parallel.
  • Well designed scala/spark project
    4 projects | /r/scala | 15 Oct 2022
  • Unit & integration testing in Databricks
    3 projects | /r/dataengineering | 30 Apr 2022
    If the majority of your code is not UDF-based, there is an open-source solution for running assertion tests against full DataFrames called spark-fast-tests. The idea here is similar: a test notebook calls your actual notebook against a staged input, reads the output, and compares it to a prefabricated expected output. This takes a bit of setup and trial and error, but it's the closest I've been able to get to proper automated regression testing in Databricks.
  • Show dataengineering: beavis, a library for unit testing Pandas/Dask code
    3 projects | /r/dataengineering | 9 Aug 2021
    I am the author of spark-fast-tests and chispa, libraries for unit testing Scala Spark / PySpark code.
  • Ask HN: What are some tools / libraries you built yourself?
    264 projects | 16 May 2021
    I built daria to make it easier to write Spark and spark-fast-tests to provide a good testing workflow.

    quinn and chispa are the PySpark equivalents.

    Built bebe to expose the Spark Catalyst expressions that aren't exposed to the Scala / Python APIs.

    Also built spark-sbt.g8 to create a Spark project with a single command:

  • Open source contributions for a Data Engineer?
    17 projects | /r/dataengineering | 16 Apr 2021
    I've built popular PySpark (quinn, chispa) and Scala Spark (spark-daria, spark-fast-tests) libraries.
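
The DataFrame-level assertion testing mentioned in the posts above can be sketched roughly as follows. This is a minimal illustration, not code from any of the posts: it assumes ScalaTest as the test framework and a local SparkSession, and the `Transformations` object, the `withUpperName` function, and the column names are hypothetical stand-ins for real business logic. `DataFrameComparer` and `assertSmallDataFrameEquality` come from spark-fast-tests.

```scala
import com.github.mrpowers.spark.fast.tests.DataFrameComparer
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, upper}
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical business logic kept in a library, outside any notebook.
object Transformations {
  def withUpperName(df: DataFrame): DataFrame =
    df.withColumn("name_upper", upper(col("name")))
}

class TransformationsSpec extends AnyFunSuite with DataFrameComparer {

  // Local session for tests; master and app name are illustrative choices.
  lazy val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("spark-fast-tests-sketch")
    .getOrCreate()

  import spark.implicits._

  test("withUpperName matches the expected output") {
    // Staged input and hand-built expected output (assumed sample data).
    val input = Seq("alice", "bob").toDF("name")
    val expected = Seq(("alice", "ALICE"), ("bob", "BOB")).toDF("name", "name_upper")

    val actual = Transformations.withUpperName(input)

    // Compares schema and rows of the two DataFrames; fails the test on a mismatch.
    assertSmallDataFrameEquality(actual, expected)
  }
}
```

Keeping the transformation in a plain function, as in this sketch, lets the same code be called from a notebook in a few lines and tested from an IDE or CI job without a cluster.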


