Top 16 Python Data Validation Projects
-
cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
-
deepchecks
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
-
soda-core
Data quality testing for the modern data stack (SQL, Spark, and Pandas). https://www.soda.io
-
Encord Active
Open source active learning toolkit to find failure modes in your computer vision models, prioritize data to label next, and drive data curation to improve model performance.
-
python-codicefiscale
Italian fiscal code (Codice Fiscale) encoding, decoding and validation.
Project mention: [Research] Detecting Annotation Errors in Semantic Segmentation Data | /r/MachineLearning | 2023-11-05
We have freely open-sourced our new method for improving segmentation data, published a paper on the research behind it, and released a 5-min code tutorial. You can also read more in the blog if you'd like.
Project mention: Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks | dev.to | 2024-01-13
Also, if anything about it is unclear, you can go directly to their discussion section on GitHub:
Project mention: Show HN: Config-file-validator – CLI tool to validate all your config files | news.ycombinator.com | 2023-09-29
I was expecting this to validate that the configuration files are also valid for their use cases, not just valid JSON, TOML, etc.
If you're looking for that and Python is your jam, the library cerberus[0] is very good at it.
[0]: https://github.com/pyeve/cerberus
Project mention: Launch HN: Encord (YC W21) – Unit testing for computer vision models | news.ycombinator.com | 2024-01-31
We base our pricing on your user and consumption scale and would be happy to discuss this with you directly. Please feel free to explore the OS version of Active at https://github.com/encord-team/encord-active. Note that some features, such as natural language search using GPU-accelerated APIs, are not included in the cloud version.
Python Data Validation related posts
- Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks
- Deepchecks: Open-source ML testing and validation library
- Deepchecks' New Open Source is on Product Hunt, and Needs Your Help
- Do you think we need an open-source web scraping monitoring tool?
- [D] Is accurately estimating image quality even possible?
- Python: Data validation
- Deepchecks
Index
What are some of the best open-source Data Validation projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | cleanlab | 8,592 |
2 | jsonschema | 4,431 |
3 | deepchecks | 3,338 |
4 | Cerberus | 3,106 |
5 | pandera | 2,994 |
6 | schema | 2,830 |
7 | Schematics | 2,571 |
8 | voluptuous | 1,798 |
9 | soda-core | 1,745 |
10 | cleanvision | 919 |
11 | colander | 440 |
12 | Encord Active | 420 |
13 | valideer | 264 |
14 | python-codicefiscale | 66 |
15 | laravel-validation | 10 |
16 | data_check | 4 |