gretel-airflow-pipelines vs soda-sql

gretel-airflow-pipelines

Runbooks for running Gretel on Apache Airflow (by gretelai)

Airflow

soda-sql

Data profiling, testing, and monitoring for SQL accessible data. (by sodadata)

data-testing data-monitoring data-quality data-quality-monitoring data-unit-tests data-engineering data-pipeline-monitoring pandas-profiling Data Science Python dbt Airflow airflow-operators Observability data-observability data-profiling soda-sql Metrics Monitoring

DISCONTINUED

Project         gretel-airflow-pipelines   soda-sql
Mentions        1                          25
Stars           6                          50
Growth          -                          -
Activity        0.0                        8.2
Latest Commit   over 2 years ago           over 1 year ago
Language        Python                     Python
License         -                          Apache License 2.0

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity - a relative number indicating how actively a project is being developed. Recent commits have a higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.
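
The page does not publish the formula behind the Activity number, so the following is only a minimal sketch of one way such a metric could work, under two assumptions that are not from the source: each commit's weight halves every 90 days (the hypothetical `half_life_days` parameter), and the final score is the share of tracked projects with a lower raw score, scaled to 0-10. The function names are illustrative as well.

```python
from datetime import date


def weighted_commit_score(commit_dates, today, half_life_days=90.0):
    """Sum per-commit weights so that recent commits count more than old ones.

    Assumption: a commit's weight halves every `half_life_days` days.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_scores(raw_scores):
    """Convert raw scores to a relative 0-10 scale.

    Assumption: a project's activity is 10 times the fraction of tracked
    projects whose raw score is strictly lower, so 9.0 means the project
    outranks 90% of the projects being compared.
    """
    n = len(raw_scores)
    result = {}
    for name, score in raw_scores.items():
        below = sum(1 for other in raw_scores.values() if other < score)
        result[name] = round(10.0 * below / n, 1)
    return result
```

Under this sketch, a project whose latest commit is years old would have a raw score close to zero, which is in line with the 0.0 shown for gretel-airflow-pipelines, although the site's actual calculation may differ.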

gretel-airflow-pipelines

Posts with mentions or reviews of gretel-airflow-pipelines. We have used some of these posts to build our list of alternatives and similar projects.

Running Gretel on Apache Airflow - privacy engineering synthetics
1 project | /r/dataengineering | 24 Aug 2021

Here's a link to the GitHub repo from the blog: https://github.com/gretelai/gretel-airflow-pipelines

soda-sql

Posts with mentions or reviews of soda-sql. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-03-18.

Data Quality - Great Expectations for Data Engineers
2 projects | /r/dataengineering | 18 Mar 2022

I might be a bit biased, but that was my opinion even before I started contributing to Soda SQL.

dbt vs R/Python for transformation
2 projects | /r/dataengineering | 25 Feb 2022

SodaCL - preview of a new "data reliability as code" language
1 project | /r/dataengineering | 13 Feb 2022

I'm one of the developers of the open-source soda-sql data quality monitoring library. Over the past year we got some incredible feedback from our users, and based on that we started working on a new DSL for data reliability as code, which we are calling Soda CL.

How do you test your pipelines?
3 projects | /r/dataengineering | 23 Jan 2022

You can also use soda-sql to do checks on your warehouses separately. Both Soda SQL and Soda Spark are OSS/Apache licensed.

Being constantly shut down by more senior team members when I mention adding some QA in our work
1 project | /r/dataengineering | 10 Jan 2022

As many have said, there might be a business side of things to deliver. Somebody above promised delivery with tight deadlines. Trust me, I am not a fan, but this is how the world works, and it sucks. I would say, in your free time, explore tools like Great Expectations (https://greatexpectations.io/) or soda-sql (https://github.com/sodadata/soda-sql), which are modern ways of testing, as part of your learning curve.

Soda
1 project | /r/devopspro | 10 Dec 2021

How heavily do you use Great Expectations?
2 projects | /r/dataengineering | 23 Sep 2021

What are some exciting new tools/libraries in 2021?
2 projects | /r/datascience | 20 Jun 2021

soda-sql is a really cool library to automate data quality checks on SQL tables.

How do I incorporate testing after the fact?
1 project | /r/dataengineering | 18 May 2021

Look at SodaSQL. It's more enterprise focused than Great Expectations and you can pipe results to a database for downstream actions and analysis.

Data Testing Tools, Pytest vs Great Expectations vs Soda vs Deequ
2 projects | /r/dataengineering | 17 May 2021

Certainly! It’s not requested that much 😊 but please add an issue on GitHub. I would love to add at least experimental support.

What are some alternatives?

When comparing gretel-airflow-pipelines and soda-sql you can also consider the following projects:

airflow-testing-ci-workflow - (project & tutorial) dag pipeline tests + ci/cd setup

deequ - Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

jina - ☁️ Build multimodal AI applications with cloud-native stack

pandera - A light-weight, flexible, and expressive statistical data testing library

Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

sqlfluff - A modular SQL linter and auto-formatter with support for multiple dialects and templated code.

elyra - Elyra extends JupyterLab with an AI centric approach.

dbt-sessionization - Using DBT for Creating Session Abstractions on RudderStack - an open-source, warehouse-first customer data pipeline and Segment alternative.

re_data - re_data - fix data issues before your users & CEO would discover them 😊

trino_data_mesh - Proof of concept on how to gain insights with Trino across different databases from a distributed data mesh

spark-fast-tests - Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)

Prefect - The easiest way to build, run, and monitor data pipelines at scale.
