Top 3 Python data-quality-monitoring Projects
- soda-core: Data quality testing for the modern data stack (SQL, Spark, and Pandas). https://www.soda.io
- swiple: Swiple enables you to easily observe, understand, validate and improve the quality of your data.
If the issue happens a lot, there is also data-diff: https://github.com/datafold/data-diff
It is a nice tool that can also do the comparison across databases.
I think it's based on a checksum method.
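To make the checksum idea concrete, here is a minimal sketch of how a checksum-based comparison can work in general. It is not data-diff's actual implementation or API: the `items` table, its integer `id` key, and the helper names are all hypothetical, and SQLite stands in for two separate databases.

```python
import hashlib
import sqlite3


def segment_checksums(conn, table, segment_size):
    """Hash rows in fixed-size key ranges; returns {segment_index: hex digest}."""
    hashers = {}
    # Assumption: the table has an integer `id` column as its first column.
    # The table name is interpolated directly, which is fine for a sketch only.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY id"):
        segment = row[0] // segment_size
        hashers.setdefault(segment, hashlib.sha256()).update(repr(row).encode("utf-8"))
    return {segment: h.hexdigest() for segment, h in hashers.items()}


def differing_segments(conn_a, conn_b, table, segment_size=100):
    """Return the sorted segment indexes whose checksums do not match."""
    a = segment_checksums(conn_a, table, segment_size)
    b = segment_checksums(conn_b, table, segment_size)
    return sorted(seg for seg in a.keys() | b.keys() if a.get(seg) != b.get(seg))


def make_db(rows):
    """Build an in-memory SQLite database holding the hypothetical `items` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    return conn


source = make_db([(i, f"v{i}") for i in range(300)])
target = make_db([(i, f"v{i}") for i in range(300)])
target.execute("UPDATE items SET value = 'changed' WHERE id = 150")

print(differing_segments(source, target, "items"))  # prints [1]
```

Only the segments whose checksums differ need a row-by-row comparison afterwards, which is why this kind of approach can be cheap when the two tables are mostly identical.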
NOTE:
The open source projects on this list are ordered by number of GitHub stars.
The number of mentions indicates repo mentions in the last 12 months or
since we started tracking (Dec 2020).
Python data-quality-monitoring related posts
- Data profiling tools / approaches?
- Data QC? Great Expectations?
- Show HN: Soda Core is now GA – Test data like you would test your code
- Data Quality - Great Expectations for Data Engineers
- dbt vs R/Python for transformation
- SodaCL - preview of a new "data reliability as code" language
- Being constantly shut down by more senior team members when I mention adding some QA in our work
Index
What are some of the best open-source data-quality-monitoring projects in Python? This list will help you:
# | Project | Stars
---|---|---
1 | data-diff | 2,830
2 | soda-core | 1,751
3 | swiple | 77