soda-sql VS data_check

Compare soda-sql and data_check to see how they differ.

                soda-sql              data_check
Mentions        25                    1
Stars           50                    4
Growth          -                     -
Activity        8.2                   8.3
Last commit     over 1 year ago       about 1 month ago
Language        Python                Python
License         Apache License 2.0    MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

soda-sql

Posts with mentions or reviews of soda-sql. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-03-18.

data_check

Posts with mentions or reviews of data_check. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-03-18.

What are some alternatives?

When comparing soda-sql and data_check you can also consider the following projects:

deequ - Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

F2-Data-Pipeline - Pipeline for Automated Updates of Kaggle's "Formula 2 Dataset"

pandera - A light-weight, flexible, and expressive statistical data testing library

data-validator - A tool to validate data, built around Apache Spark.

sqlfluff - A modular SQL linter and auto-formatter with support for multiple dialects and templated code.

dbt-sessionization - Using DBT for Creating Session Abstractions on RudderStack - an open-source, warehouse-first customer data pipeline and Segment alternative.

re_data - Fix data issues before your users & CEO would discover them 😊

trino_data_mesh - Proof of concept on how to gain insights with Trino across different databases from a distributed data mesh

spark-fast-tests - Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)

Prefect - The easiest way to build, run, and monitor data pipelines at scale.

piperider - Code review for data in dbt

dagster - An orchestration platform for the development, production, and observation of data assets.