Python Data Validation

Open-source Python projects categorized as Data Validation

Top 18 Python Data Validation Projects

  1. cleanlab

    The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

    Project mention: Ask HN: Not a webdev, why are these sites so good? | news.ycombinator.com | 2024-06-18

    https://cleanlab.ai/

  2. jsonschema

    An implementation of the JSON Schema specification for Python

  3. deepchecks

    Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.

  4. pandera

    A light-weight, flexible, and expressive statistical data testing library

  5. Cerberus

    Lightweight, extensible data validation library for Python (by pyeve)

  6. schema

    Schema validation just got Pythonic

  7. Schematics

    Python Data Structures for Humans™.

  8. soda-core

    ⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas). https://www.soda.io

  9. voluptuous

    CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

  10. cleanvision

    Automatically find issues in image datasets and practice data-centric computer vision.

  11. colander

    A serialization/deserialization/validation library for strings, mappings and lists.

  12. Encord Active

    Open source active learning toolkit to find failure modes in your computer vision models, prioritize data to label next, and drive data curation to improve model performance.

  13. valideer

    Lightweight data validation and adaptation Python library.

  14. python-codicefiscale

    🇮🇹 💳 Italian fiscal code (Codice Fiscale) encoding, decoding, and validation.

  15. Validoopsie

    A simple and easy to use Data Validation library for Python.

    Project mention: All Data and AI Weekly 179 - 03-March-2025 | dev.to | 2025-03-03

    ❄️ https://github.com/akmalsoliev/Validoopsie

  16. snowflake-provisioning

    Snowflake database, schema, and warehouse provisioning with access roles; generation and provisioning of functional roles; Snowflake source export, Snowflake cloning, and a data tie-out tool

  17. laravel-validation

    A PHP Laravel-like validation library for Python

  18. data_check

    Data and pipeline testing with and for SQL

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Data Validation related posts

  • Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks

    1 project | dev.to | 13 Jan 2024
  • Deepchecks: Open-source ML testing and validation library

    1 project | news.ycombinator.com | 11 Sep 2023
  • Deepchecks' New Open Source is on Product Hunt, and Needs Your Help

    3 projects | /r/deeplearning | 18 Jun 2023
  • Do you think we need an open-source web scraping monitoring tool?

    2 projects | /r/webscraping | 6 May 2023
  • [D] Is accurately estimating image quality even possible?

    3 projects | /r/MachineLearning | 22 Apr 2023
  • Python: Data validation

    5 projects | dev.to | 20 Jan 2023
  • Deepchecks

    1 project | /r/devopspro | 22 Aug 2022

Index

What are some of the best open-source Data Validation projects in Python? This list will help you:

#   Project                  Stars
1   cleanlab                 10,241
2   jsonschema               4,723
3   deepchecks               3,742
4   pandera                  3,698
5   Cerberus                 3,198
6   schema                   2,904
7   Schematics               2,582
8   soda-core                2,043
9   voluptuous               1,831
10  cleanvision              1,058
11  colander                 456
12  Encord Active            448
13  valideer                 263
14  python-codicefiscale     77
15  Validoopsie              58
16  snowflake-provisioning   42
17  laravel-validation       12
18  data_check               4
