Data Validation

Open-source projects categorized as Data Validation

Top 23 Data Validation Open-Source Projects

  • Yup

    Dead simple Object schema validation

    Project mention: Using React Select with Formik | dev.to | 2024-03-18

    I was recently building an application that, among other features, allows a user to submit chess players and chess games to a database. I was utilizing Yup for form schema and Formik for error handling, validation, and form submission.

  • react-jsonschema-form

    A React component for building Web forms from JSON Schema.

    Project mention: Framework Interoperable Component Libraries Using Lit Web Components. | dev.to | 2023-10-08

    I've been very passionate about a project called react-jsonschema-form. I personally hate writing forms, and love the idea of serializable components, schema, and validation all in one. I've always wanted a counterpart to this project that does not depend on React, and possibly the ability to render a schema form to static HTML (as in SSG).

  • ajv

    The fastest JSON schema Validator. Supports JSON Schema draft-04/06/07/2019-09/2020-12 and JSON Type Definition (RFC8927)

    Project mention: 6 Reasons why JSON Schema is worth your time | dev.to | 2023-10-03

    In the JavaScript ecosystem you can use the excellent AJV package to validate any JavaScript object against a JSON schema. This is especially useful to ensure that API contracts are maintained when communicating with other services.

  • cleanlab

    The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

    Project mention: [Research] Detecting Annotation Errors in Semantic Segmentation Data | /r/MachineLearning | 2023-11-05

    We have freely open-sourced our new method for improving segmentation data, published a paper on the research behind it, and released a 5-min code tutorial. You can also read more in the blog if you'd like.

  • Superstruct

    A simple and composable way to validate data in JavaScript (and TypeScript).

    Project mention: Lessons from open-source: Replace zod with superstruct if you do not use zod’s advanced capabilities | dev.to | 2024-03-26

    This is where I saw that the compiled folder contains superstruct’s minified code.

  • jsonschema

    An implementation of the JSON Schema specification for Python

  • deepchecks

    Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.

    Project mention: Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks | dev.to | 2024-01-13

    Also, if anything about it is unclear, you can go directly to the discussion section of their GitHub repository.

  • Cerberus

    Lightweight, extensible data validation library for Python (by pyeve)

    Project mention: Show HN: Config-file-validator – CLI tool to validate all your config files | news.ycombinator.com | 2023-09-29

    I was expecting this to validate that the configuration files are also valid for their use cases, not just valid JSON, TOML, etc.

    If you're looking for that and Python is your jam, the library cerberus[0] is very good at it.

    [0]: https://github.com/pyeve/cerberus

  • pandera

    A light-weight, flexible, and expressive statistical data testing library

  • schema

    Schema validation just got Pythonic

  • Schematics

    Python Data Structures for Humans™.

  • JSON-Splora

    GUI app for editing, visualizing, and manipulating JSON data

  • voluptuous

    CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

  • soda-core

    Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

  • forgJs

    ForgJs is a lightweight JavaScript object validator.

  • dry-validation

    Validation library with type-safe schemas and rules

    Project mention: Payload and parameters validation in Rails | dev.to | 2024-03-09

    Luckily, there is a large pool of community wisdom in and around Rails that can help us a lot here. Instead of reinventing the wheel, for now we will use one that others have already built. I'm pretty sure you have seen this magic used outside of Hogwarts before: https://dry-rb.org/gems/dry-validation.

  • tv4

    Tiny Validator for JSON Schema v4

  • is-my-json-valid

    A JSONSchema validator that uses code generation to be extremely fast

  • cleanvision

    Automatically find issues in image datasets and practice data-centric computer vision.

    Project mention: [D] Is accurately estimating image quality even possible? | /r/MachineLearning | 2023-04-22

    Github: https://github.com/cleanlab/cleanvision Blogpost: https://cleanlab.ai/blog/cleanvision/

  • pointblank

    Data quality assessment and metadata reporting for data frames and database tables

    Project mention: R: Introduction to Data Science | news.ycombinator.com | 2024-03-02

    (1) You might want to check out https://github.com/t-kalinowski/Rapp by my colleague Tomasz

    (2) I think part of that is in scope for strict (https://github.com/hadley/strict). You might also be well served by adopting some more data validation tooling, e.g. pointblank (https://rstudio.github.io/pointblank/).

  • schema-inspector

    Schema-Inspector is a simple JavaScript object sanitization and validation module.

  • colander

    A serialization/deserialization/validation library for strings, mappings and lists.

  • Encord Active

    Open source active learning toolkit to find failure modes in your computer vision models, prioritize data to label next, and drive data curation to improve model performance.

    Project mention: Launch HN: Encord (YC W21) – Unit testing for computer vision models | news.ycombinator.com | 2024-01-31

    We base our pricing on your user and consumption scale and would be happy to discuss this with you directly. Please feel free to explore the OS version of Active at https://github.com/encord-team/encord-active. Note that some features, such as natural language search using GPU accelerated APIs, are not included in the cloud version.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-03-26.
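
Several projects on this list (ajv, jsonschema, tv4, is-my-json-valid) check data against a JSON Schema. As a rough illustration of what that involves, the Python sketch below hand-rolls a tiny subset of the idea (only the `type`, `required`, and `properties` keywords) using the standard library alone. It is not the API of any listed project, and the `validate` function and `player_schema` names are hypothetical; real projects should use one of the libraries above, which cover the full specification.

```python
from typing import Any

# Map JSON Schema type names to Python types (a small, illustrative subset).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def validate(instance: Any, schema: dict, path: str = "$") -> list[str]:
    """Return a list of error messages; an empty list means the instance is valid."""
    errors: list[str] = []
    expected = schema.get("type")
    if expected is not None:
        py_type = _TYPES[expected]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(instance, bool) and expected in ("integer", "number")
        if not isinstance(instance, py_type) or is_bool_as_number:
            errors.append(f"{path}: expected {expected}")
            return errors
    if isinstance(instance, dict):
        # Every key named in "required" must be present.
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        # Recurse into each declared property that is present.
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], subschema, f"{path}.{key}"))
    return errors


# Hypothetical schema for illustration only.
player_schema = {
    "type": "object",
    "required": ["name", "rating"],
    "properties": {
        "name": {"type": "string"},
        "rating": {"type": "integer"},
    },
}

print(validate({"name": "Ada", "rating": 2100}, player_schema))
print(validate({"name": "Ada", "rating": "high"}, player_schema))
```

The first call prints an empty list because the object satisfies the schema; the second reports that `rating` is not an integer. Collecting all errors into a list, rather than raising on the first one, is a design choice that suits form-style feedback.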

Index

What are some of the best open-source Data Validation projects? This list will help you:

#   Project                 Stars
1   Yup                     22,080
2   react-jsonschema-form   13,562
3   ajv                     13,261
4   cleanlab                8,125
5   Superstruct             6,787
6   jsonschema              4,411
7   deepchecks              3,289
8   Cerberus                3,100
9   pandera                 2,919
10  schema                  2,827
11  Schematics              2,569
12  JSON-Splora             1,856
13  voluptuous              1,798
14  soda-core               1,724
15  forgJs                  1,666
16  dry-validation          1,313
17  tv4                     1,161
18  is-my-json-valid        955
19  cleanvision             910
20  pointblank              814
21  schema-inspector        504
22  colander                440
23  Encord Active           418