-
dbx
🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.
-
spark-fast-tests
Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)
-
databricks-nutter-projects-demo
Discontinued. Demo of using Nutter to test Databricks notebooks in a CI/CD pipeline [Moved to: https://github.com/alexott/databricks-nutter-repos-demo]
Hey, Databricks person here. Check out DBX for a template on how to do unit and integration tests: https://github.com/databrickslabs/dbx
If the majority of your stuff is not UDF-based, there is an open-source solution for running assertion tests against full DataFrames called spark-fast-tests. The idea here is similar in that you have a test notebook that calls your actual notebook against a staged input, reads the output, and compares it to a prefabricated expected output. This does take a bit of setup and trial and error, but it's the closest I've been able to get to proper automated regression testing in Databricks.
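To make the staged-input / expected-output idea concrete, here is a rough sketch in plain Python. It does not use Spark or the spark-fast-tests API: rows are plain dicts standing in for DataFrame rows, and `transform`, `row_diff`, and `assert_rows_equal` are hypothetical names made up for this illustration. In a real setup the transformation would be your notebook and the comparison would be done by a DataFrame comparison library.

```python
from collections import Counter


def transform(rows):
    """Hypothetical stand-in for the notebook under test: total amount per customer."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0) + row["amount"]
    return [{"customer": c, "total": t} for c, t in totals.items()]


def row_diff(actual, expected):
    """Return (missing, unexpected) rows, ignoring row order but counting duplicates."""
    def freeze(rows):
        # Turn each row into a hashable, key-sorted tuple so rows can be counted.
        return Counter(tuple(sorted(r.items())) for r in rows)

    a, e = freeze(actual), freeze(expected)
    missing = list((e - a).elements())      # expected but not produced
    unexpected = list((a - e).elements())   # produced but not expected
    return missing, unexpected


def assert_rows_equal(actual, expected):
    """Raise AssertionError with both sides of the difference if the outputs differ."""
    missing, unexpected = row_diff(actual, expected)
    if missing or unexpected:
        raise AssertionError(
            f"missing rows: {missing}; unexpected rows: {unexpected}"
        )


# Staged input and the prefabricated expected output (made-up sample data).
staged_input = [
    {"customer": "a", "amount": 10},
    {"customer": "b", "amount": 5},
    {"customer": "a", "amount": 7},
]
expected_output = [
    {"customer": "b", "total": 5},
    {"customer": "a", "total": 17},
]

# The regression check: run the logic on staged input, compare with expected.
assert_rows_equal(transform(staged_input), expected_output)
```

The comparison ignores row order on purpose, since distributed output usually has no guaranteed ordering; whether that is the right choice for your tables is something to decide per test.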
You can also use the approach described here https://github.com/alexott/databricks-nutter-projects-demo