SaaSHub helps you find the best software and product alternatives Learn more β
Soda-sql Alternatives
Similar projects and alternatives to soda-sql
-
airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
Metabase
The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
-
mljar-supervised
Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
-
sqlfluff
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
-
deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
-
ballista
Discontinued Distributed compute platform implemented in Rust, and powered by Apache Arrow.
-
spark-fast-tests
Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)
-
dbt-sessionization
Using DBT for Creating Session Abstractions on RudderStack - an open-source, warehouse-first customer data pipeline and Segment alternative.
-
trino_data_mesh
Proof of concept on how to gain insights with Trino across different databases from a distributed data mesh
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
soda-sql reviews and mentions
-
Data Quality - Great Expectations for Data Engineers
I might be a bit biased, but that was my opinion before even I started contributing to Soda SQL.
- dbt vs R/Python for transformation
-
SodaCL - preview of a new "data reliability as code" language
I'm one of the developers of the Open Source soda-sql data quality monitoring library, and over the past year we got some incredible feedback from our users, and based on that we started working on a new DSL for data reliability as code we are calling Soda CL.
-
How do you test your pipelines?
You can also use soda-sql to do checks on your warehouses separately. Both Soda SQL and Soda Spark are OSS/Apache licensed.
-
Being constantly shut down by more senior team members when I mention adding some QA in our work
As many have said, there might be business side of things to deliver. Somebody above promised delivery with tight deadlines. Trust me, I am not a fan, but this how the world works and it sucks. I would say in your free time, explore tools like greatexpectations.io https://greatexpectations.io/ or https://github.com/sodadata/soda-sql which are modern ways of testing in your learning curve
- Soda
- How heavily do you use Great Expectations?
-
What are some exciting new tools/libraries in 2021?
soda-sql really cool library to automate data quality checks on SQL tables
-
How do I incorporate testing after the fact?
Look at SodaSQL. It's more enterprise focused than Great Expectations and you can pipe results to a database for downstream actions and analysis.
-
Data Testing Tools, Pytest vs Great Expectations vs Soda vs Deequ
Certainly! Itβs not requested that much π but please add an issue on GitHub . I would love to add at least experimental support.
-
A note from our sponsor - SaaSHub
www.saashub.com | 7 May 2024
Stats
sodadata/soda-sql is an open source project licensed under Apache License 2.0 which is an OSI approved license.
The primary programming language of soda-sql is Python.
Sponsored