Python Data Validation

Open-source Python projects categorized as Data Validation

Top 16 Python Data Validation Projects

  • cleanlab

    The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

  • Project mention: Ask HN: Not a webdev, why are these sites so good? | news.ycombinator.com | 2024-06-18

    https://cleanlab.ai/

  • jsonschema

    An implementation of the JSON Schema specification for Python

  • deepchecks

    Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.

  • Project mention: Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks | dev.to | 2024-01-13

    Also, if anything about it is unclear, you can go directly to their discussion section on GitHub:

  • Cerberus

    Lightweight, extensible data validation library for Python (by pyeve)

  • Project mention: Show HN: Config-file-validator – CLI tool to validate all your config files | news.ycombinator.com | 2023-09-29

    I was expecting this to validate that the configuration files are also valid for their use cases, not just valid JSON, TOML, etc.

    If you're looking for that and Python is your jam, the library cerberus[0] is very good at it.

    [0]: https://github.com/pyeve/cerberus

  • pandera

    A light-weight, flexible, and expressive statistical data testing library

  • schema

    Schema validation just got Pythonic

  • Schematics

    Python Data Structures for Humans™.

  • voluptuous

    CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

  • soda-core

    Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

  • cleanvision

    Automatically find issues in image datasets and practice data-centric computer vision.

  • colander

    A serialization/deserialization/validation library for strings, mappings and lists.

  • Encord Active

    Open source active learning toolkit to find failure modes in your computer vision models, prioritize data to label next, and drive data curation to improve model performance.

  • Project mention: Launch HN: Encord (YC W21) – Unit testing for computer vision models | news.ycombinator.com | 2024-01-31

    We base our pricing on your user and consumption scale and would be happy to discuss this with you directly. Please feel free to explore the OS version of Active at https://github.com/encord-team/encord-active. Note that some features, such as natural language search using GPU accelerated APIs, are not included in the cloud version.

  • valideer

    Lightweight data validation and adaptation Python library.

  • python-codicefiscale

    Encoding, decoding, and validation of Italian fiscal codes (Codice Fiscale italiano).

  • laravel-validation

    Laravel-like (PHP) validation for Python

  • data_check

    data and pipeline testing with and for SQL

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Data Validation related posts

  • Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks

    1 project | dev.to | 13 Jan 2024
  • Deepchecks: Open-source ML testing and validation library

    1 project | news.ycombinator.com | 11 Sep 2023
  • Deepchecks' New Open Source is on Product Hunt, and Needs Your Help

    3 projects | /r/deeplearning | 18 Jun 2023
  • Do you think we need an open-source web scraping monitoring tool?

    2 projects | /r/webscraping | 6 May 2023
  • [D] Is accurately estimating image quality even possible?

    3 projects | /r/MachineLearning | 22 Apr 2023
  • Python: Data validation

    5 projects | dev.to | 20 Jan 2023
  • Deepchecks

    1 project | /r/devopspro | 22 Aug 2022

Index

What are some of the best open-source Data Validation projects in Python? This list will help you:

Rank  Project               Stars
1     cleanlab              8,913
2     jsonschema            4,483
3     deepchecks            3,452
4     Cerberus              3,119
5     pandera               3,097
6     schema                2,850
7     Schematics            2,577
8     voluptuous            1,806
9     soda-core             1,805
10    cleanvision           937
11    colander              445
12    Encord Active         425
13    valideer              264
14    python-codicefiscale  67
15    laravel-validation    10
16    data_check            4
