Python data-quality-checks

Open-source Python projects categorized as data-quality-checks

Top 3 Python data-quality-check Projects

  • soda-core

    :zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

  • swiple

    Swiple enables you to easily observe, understand, validate and improve the quality of your data

  • DQC-Toolkit

    Data quality checks to curate noisy labels in the data

    Project mention: Handling Noisy Labels in Text Classification | dev.to | 2024-04-24

    Instead of building your own DQC, you could also simply use DQC-Toolkit, the open-source library mentioned at the beginning of this post, to run quality checks on data. To understand how to use it, let's do a quick demo by extending the reproducibility experiment to DQC-Toolkit.
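    As a rough illustration of what a label-quality check can look like, here is a minimal standard-library sketch. It is not DQC-Toolkit's API or method; the function name `find_conflicting_labels` and the sample records are hypothetical. It flags texts that appear more than once with different labels, which is one simple symptom of noisy labels.

```python
from collections import Counter, defaultdict


def find_conflicting_labels(records):
    """Return {normalized_text: Counter(labels)} for texts with more than one distinct label.

    `records` is an iterable of (text, label) pairs. Texts are compared after
    lowercasing and collapsing whitespace, which is an assumption of this sketch.
    """
    labels_by_text = defaultdict(Counter)
    for text, label in records:
        # Normalize so trivially different copies of the same text are grouped together.
        key = " ".join(text.lower().split())
        labels_by_text[key][label] += 1
    # Keep only texts whose copies disagree on the label.
    return {text: counts for text, counts in labels_by_text.items() if len(counts) > 1}


# Hypothetical sample data: the first two rows are the same text with different labels.
sample_records = [
    ("Great product", "positive"),
    ("great  product", "negative"),
    ("Arrived broken", "negative"),
]
conflicts = find_conflicting_labels(sample_records)
for text, counts in conflicts.items():
    print(text, dict(counts))
```

    A check like this only catches exact or near-exact duplicates; model-based approaches are needed to surface mislabeled examples that have no duplicate.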

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source data-quality-check projects in Python? This list will help you:

#  Project      Stars
1  soda-core    1,765
2  swiple       78
3  DQC-Toolkit  0
