Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.
Top 7 Python Data Quality Projects
-
cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Project mention: Ask HN: Not a webdev, why are these sites so good? | news.ycombinator.com | 2024-06-18
https://cleanlab.ai/
-
soda-core
Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
-
data-observability-installer
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
-
fastapi-greatexpectations
Run greatexpectations.io on any SQL engine using a REST API. Supported by FastAPI, Pydantic, and SQLAlchemy.
Python dataquality related posts
-
Data Quality at Scale with Great Expectations, Spark, and Airflow on EMR
-
Soda Core (OSS) is now GA! So, why should you add checks to your data pipelines?
-
Greatexpectations - Always know what to expect from your data.
-
Greatexpectations – Always know what to expect from your data
-
Package for drift detection
-
[D] Do you use data engineering pipelines for real life projects?
-
Launch HN: Elementary (YC W22) – Open-source data observability
-
A note from our sponsor - CodeRabbit
coderabbit.ai | 18 Mar 2025
Index
What are some of the best open-source data quality projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | great_expectations | 10,253 |
| 2 | cleanlab | 10,227 |
| 3 | soda-core | 2,036 |
| 4 | cuallee | 183 |
| 5 | data-observability-installer | 108 |
| 6 | fastapi-greatexpectations | 12 |
| 7 | data_check | 4 |