data-validator
A tool to validate data, built around Apache Spark. (by target)
data_check
data and pipeline testing with and for SQL (by andrjas)
Metric | data-validator | data_check
---|---|---
Mentions | 2 | 1
Stars | 95 | 4
Growth | - | -
Activity | 7.4 | 8.3
Last commit | 19 days ago | about 2 months ago
Language | Scala | Python
License | GNU General Public License v3.0 or later | MIT License
The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
data-validator
Posts with mentions or reviews of data-validator.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2021-03-18.
data_check
Posts with mentions or reviews of data_check.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2021-03-18.
- Anyone aware of any Data Validation Framework with custom SQL capability
  Maybe this can help: https://github.com/andrjas/data_check
What are some alternatives?
When comparing data-validator and data_check you can also consider the following projects:
soda-sql - Data profiling, testing, and monitoring for SQL accessible data.
mmlspark - Simple and Distributed Machine Learning [Moved to: https://github.com/microsoft/SynapseML]
F2-Data-Pipeline - Pipeline for Automated Updates of Kaggle's "Formula 2 Dataset"
data-caterer - Data generation and validation tool for any data source
HAMB