Unit & integration testing in Databricks

dbx

5 433 4.6 Python

🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.

Hey, Databricks person here. Check out DBX for a template on how to do unit and integration tests: https://github.com/databrickslabs/dbx

spark-fast-tests

6 418 0.0 Scala

Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)

If most of your code is not UDF-based, there is an open-source solution for running assertion tests against full DataFrames called spark-fast-tests. The idea here is similar in that you have an IT (integration test) notebook that calls your actual notebook against a staged input, reads the output, and compares it to a prefabricated expected output. This does take a bit of setup and trial and error, but it's the closest I've been able to get to proper automated regression testing in Databricks.
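The comparison step described above can be sketched in plain Python. This is only an illustration of the idea, not the spark-fast-tests API (which is a Scala library): the helper names `rows_diff` and `assert_rows_equal` are hypothetical, and the sketch assumes the notebook output and the expected output have already been collected into lists of row dictionaries with hashable values, for example via `[r.asDict() for r in df.collect()]` in PySpark.

```python
from collections import Counter


def rows_diff(actual, expected):
    """Compare two lists of row dicts, ignoring row order.

    Returns (missing, unexpected): rows present in `expected` but not in
    `actual`, and rows present in `actual` but not in `expected`.
    Duplicate rows are counted, so a row that appears twice must match twice.
    """
    def freeze(row):
        # Sort the items so column order inside a row does not matter,
        # and turn the row into a hashable tuple.
        return tuple(sorted(row.items()))

    actual_counts = Counter(freeze(r) for r in actual)
    expected_counts = Counter(freeze(r) for r in expected)
    missing = list((expected_counts - actual_counts).elements())
    unexpected = list((actual_counts - expected_counts).elements())
    return missing, unexpected


def assert_rows_equal(actual, expected):
    """Raise AssertionError with a readable diff if the row sets differ."""
    missing, unexpected = rows_diff(actual, expected)
    if missing or unexpected:
        raise AssertionError(
            f"missing rows: {missing}; unexpected rows: {unexpected}"
        )


# Hypothetical staged data: the output of the notebook under test
# and the prefabricated expected output (row order differs on purpose).
actual_rows = [{"id": 2, "total": 20}, {"id": 1, "total": 10}]
expected_rows = [{"id": 1, "total": 10}, {"id": 2, "total": 20}]
assert_rows_equal(actual_rows, expected_rows)
```

Collecting to the driver like this only makes sense for small, staged test inputs; for larger outputs a DataFrame-level comparison is the better fit.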

databricks-nutter-projects-demo

1 48 6.9 Python

Discontinued. Demo of using Nutter for testing Databricks notebooks in a CI/CD pipeline [Moved to: https://github.com/alexott/databricks-nutter-repos-demo]

You can also use the approach described here: https://github.com/alexott/databricks-nutter-projects-demo

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.