soda-spark

Soda Spark is a PySpark library for testing the data in your Spark DataFrames (by sodadata)

Soda-spark Alternatives

Similar projects and alternatives to soda-spark

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better soda-spark alternative or higher similarity.

soda-spark reviews and mentions

Posts with mentions or reviews of soda-spark. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-01-23.
  • How do you test your pipelines?
    3 projects | /r/dataengineering | 23 Jan 2022
    Since you already have Spark set up, perhaps it would be easier to build DataFrames by loading data from the different tables and validate them in one go? You can give soda-spark a try (disclosure: I'm one of the developers): it lets you specify your checks declaratively in YAML and run the validations in Spark jobs.
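
The declarative approach described in the post above can be illustrated without Spark. What follows is a minimal plain-Python sketch and not soda-spark's API: the `CHECKS` mapping, the `run_checks` function, and the sample rows are all hypothetical, and the rows stand in for a DataFrame. It only shows the idea of keeping checks as data and evaluating them all in one pass.

```python
from typing import Any, Callable

Row = dict[str, Any]

# Hypothetical declarative checks: a readable name mapped to a predicate
# over all rows. In soda-spark the checks would be written in YAML instead.
CHECKS: dict[str, Callable[[list[Row]], bool]] = {
    "row_count > 0": lambda rows: len(rows) > 0,
    "id is never null": lambda rows: all(r.get("id") is not None for r in rows),
    "amount >= 0": lambda rows: all(r["amount"] >= 0 for r in rows),
}


def run_checks(
    rows: list[Row],
    checks: dict[str, Callable[[list[Row]], bool]] = CHECKS,
) -> dict[str, bool]:
    """Evaluate every check against the rows and report pass/fail per check."""
    return {name: check(rows) for name, check in checks.items()}


# Illustrative data standing in for a DataFrame built from several tables.
sample_rows = [
    {"id": "a1", "amount": 10.0},
    {"id": None, "amount": 5.5},
]

results = run_checks(sample_rows)
failed = [name for name, passed in results.items() if not passed]
print(failed)
```

Because the checks are plain data, adding a new validation means adding an entry to the mapping rather than writing a new test function, which is the main appeal of the declarative style the post describes.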

Stats

Basic soda-spark repo stats
1
60
0.0
almost 2 years ago

sodadata/soda-spark is an open source project licensed under Apache License 2.0 which is an OSI approved license.

The primary programming language of soda-spark is Python.

