deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
There's no straightforward way to drop and rerun a metric collection. For example, say you detect a problem in your data. You fix it, rerun the pipeline, and replace the bad data with the good. You'd want your metrics history to reflect the true state of your data, but the "bad run" cannot be dropped.
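To make the complaint concrete, here is a minimal Python sketch of the behavior the poster is asking for: a metrics history keyed by run, where a rerun overwrites the earlier result and a bad run can be dropped. This is not deequ's API. The `MetricsHistory` class, its methods, the `(dataset, run_date)` key, and the sample values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical run key: (dataset name, run date). Not a deequ type.
RunKey = Tuple[str, str]


@dataclass
class MetricsHistory:
    """Illustrative in-memory metrics history; not part of deequ."""

    runs: Dict[RunKey, Dict[str, float]] = field(default_factory=dict)

    def record(self, key: RunKey, metrics: Dict[str, float]) -> None:
        # Upsert: recording under an existing key replaces the earlier run.
        self.runs[key] = dict(metrics)

    def drop(self, key: RunKey) -> None:
        # Remove a run entirely; a missing key is ignored.
        self.runs.pop(key, None)

    def series(self, dataset: str, metric: str) -> List[Tuple[str, float]]:
        # Return (run_date, value) pairs for one metric, ordered by run key.
        ordered = sorted(self.runs.items(), key=lambda item: item[0])
        return [
            (key[1], metrics[metric])
            for key, metrics in ordered
            if key[0] == dataset and metric in metrics
        ]


history = MetricsHistory()
# Sample values below are made up for the example.
history.record(("orders", "2022-11-29"), {"completeness": 0.99})
history.record(("orders", "2022-11-30"), {"completeness": 0.71})  # the "bad run"
history.record(("orders", "2022-11-30"), {"completeness": 0.98})  # rerun after the fix
print(history.series("orders", "completeness"))
```

Because the rerun is recorded under the same key, the history holds one entry per run date and reflects the corrected data rather than the bad run.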
NOTE:
The number of mentions on this list counts mentions in common posts plus user-suggested alternatives.
Hence, a higher number means a more popular project.
Related posts
- Building a data quality solution for devs and business people
- deequ VS cuallee - a user-suggested alternative (2 projects | 30 Nov 2022)
- Well designed scala/spark project
- Congrats on hitting the v1 milestone, whylabs! You're r/MLOps OSS tool of the month!
- PySpark - How to get Corrupted Records after Casting