data_check VS soda-sql

Compare data_check and soda-sql to see how they differ.

                data_check           soda-sql
Mentions        1                    25
Stars           4                    50
Growth          -                    -
Activity        8.3                  8.2
Last commit     about 2 months ago   over 1 year ago
Language        Python               Python
License         MIT License          Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

data_check

Posts with mentions or reviews of data_check. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-03-18.

soda-sql

Posts with mentions or reviews of soda-sql. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-03-18.

What are some alternatives?

When comparing data_check and soda-sql you can also consider the following projects:

F2-Data-Pipeline - Pipeline for Automated Updates of Kaggle's "Formula 2 Dataset"

deequ - Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

data-validator - A tool to validate data, built around Apache Spark.

pandera - A light-weight, flexible, and expressive statistical data testing library

sqlfluff - A modular SQL linter and auto-formatter with support for multiple dialects and templated code.

dbt-sessionization - Using DBT for Creating Session Abstractions on RudderStack - an open-source, warehouse-first customer data pipeline and Segment alternative.

re_data - re_data - fix data issues before your users & CEO would discover them 😊

trino_data_mesh - Proof of concept on how to gain insights with Trino across different databases from a distributed data mesh

spark-fast-tests - Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)

Prefect - The easiest way to build, run, and monitor data pipelines at scale.

dagster - An orchestration platform for the development, production, and observation of data assets.

piperider - Code review for data in dbt