cuallee
Possibly the fastest DataFrame-agnostic quality check library in town. (by canimus)
soda-core
:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io (by sodadata)
Metric | cuallee | soda-core
---|---|---
Mentions | 5 | 5
Stars | 107 | 1,765
Growth | - | 2.3%
Activity | 9.0 | 8.9
Last commit | 6 days ago | 6 days ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
cuallee
Posts with mentions or reviews of cuallee.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-11-30.
- Show HN: Snowflake Data Quality Checks in Python
- data-diff VS cuallee - a user-suggested alternative
  2 projects | 30 Nov 2022
  Declarative data quality rules at scale
- deequ VS cuallee - a user-suggested alternative
  2 projects | 30 Nov 2022
  Cuallee offers a faster, optimized alternative to pydeequ's Check API by using the new Observation API in PySpark. It also supports the Snowpark, Pandas, Polars, and DuckDB DataFrame abstractions.
- Show HN: Pyspark and Snowpark and Pandas data quality
- Show HN: Cuallee – pyspark data quality framework for v3.3.0
soda-core
Posts with mentions or reviews of soda-core.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-03-23.
- Looking for Unit Testing framework in Database Migration Process
- Data profiling tools / approaches?
  Tools like Soda Core could be really helpful for this. For example, it allows you to set up a change-over-time threshold, which could take the form of: change avg last 3 for missing_count(column_name) < 20%
- Data QC? Great Expectations?
  You can give https://github.com/sodadata/soda-core a try. It's open source and (in my opinion) easy to get a lot of value from with minimum effort.
- Show HN: Soda Core is now GA – Test data like you would test your code
- Soda Core (OSS) is now GA! So, why should you add checks to your data pipelines?
  Give Soda Core a try! It's really easy. If you only have 2 minutes, check out our docs or interactive demo (pretty cool, no?). If you have a bit more time, install it and give it a spin! Want to look at it later? Star it on GitHub. Got stuck? Ask in our Slack community.
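
One of the posts above mentions a change-over-time threshold of the form "change avg last 3 for missing_count(column_name) < 20%". The arithmetic behind such a rule can be sketched in plain Python. This is not Soda Core's implementation or API. It assumes the rule means "the relative change of the current missing count against the average of the previous three measurements must stay below 20%". The function names and the sample counts are hypothetical.

```python
from statistics import mean


def change_vs_recent_average(history, current, window=3):
    """Relative change of `current` against the mean of the last `window` values."""
    if len(history) < window:
        raise ValueError(f"need at least {window} historical measurements")
    baseline = mean(history[-window:])
    if baseline == 0:
        # Relative change is undefined against a zero baseline; treat any
        # non-zero current value as an unbounded increase.
        return 0.0 if current == 0 else float("inf")
    return (current - baseline) / baseline


def check_passes(history, current, threshold=0.20, window=3):
    """True when the relative change stays below `threshold` (0.20 means 20%)."""
    return change_vs_recent_average(history, current, window) < threshold


# Hypothetical missing-value counts from earlier scans of one column.
previous_missing_counts = [8, 10, 12, 8]  # the last three average to 10
print(check_passes(previous_missing_counts, 11))  # 10% above the baseline
print(check_passes(previous_missing_counts, 13))  # 30% above the baseline
```

The sketch compares the signed change, so a drop in missing values always passes. A rule that should also flag sudden drops would compare the absolute value of the change instead.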
What are some alternatives?
When comparing cuallee and soda-core you can also consider the following projects:
data-diff - Compare tables within or across databases
great_expectations - Always know what to expect from your data.