| | data_check | F2-Data-Pipeline |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 4 | 8 |
| Growth | - | - |
| Activity | 8.3 | 5.9 |
| Last commit | about 2 months ago | 9 months ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Stars - the number of stars a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
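The exact formula behind the activity number is not given here, so the following is only a minimal sketch of how such a score could work. It assumes an exponential decay for commit age (the `half_life_days` parameter is hypothetical) and a simple percentile rank scaled to 0-10, which is one reading of "9.0 means top 10%". The function names are illustrative and not taken from any real service.

```python
from bisect import bisect_left


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Sum commit weights; a commit's weight halves every half_life_days.

    The half-life value is an assumption made for this sketch.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw, all_raw_scores):
    """Map a raw score to a 0-10 scale by percentile rank among tracked projects."""
    ranked = sorted(all_raw_scores)
    # Number of tracked projects with a strictly lower raw score.
    below = bisect_left(ranked, raw)
    return round(10.0 * below / len(ranked), 1)


# A commit made today counts fully, one from 30 days ago counts half, and so on.
print(recency_weighted_commits([0, 30, 60]))  # 1.75

# With 100 tracked projects, the one that outranks 90 of them scores 9.0.
print(activity_score(90, list(range(100))))  # 9.0
```

Under these assumptions, a project that has gone quiet drifts down the ranking even if its total commit count is high, which matches the statement that recent commits weigh more than older ones.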
data_check
Anyone aware of any Data Validation Framework with custom SQL capability
Maybe this can help: https://github.com/andrjas/data_check
F2-Data-Pipeline
What are some alternatives?
soda-sql - Data profiling, testing, and monitoring for SQL accessible data.
ethereum-etl-airflow - Airflow DAGs for exporting, loading, and parsing Ethereum blockchain data. See "How to get any Ethereum smart contract into BigQuery": https://towardsdatascience.com/how-to-get-any-ethereum-smart-contract-into-bigquery-in-8-mins-bab5db1fdeee
data-validator - A tool to validate data, built around Apache Spark.
covid-19-data-engineering-pipeline - A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
Mage - 🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
great_expectations - Always know what to expect from your data.
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.