Top 6 Datacleaning Open-Source Projects
-
OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
-
dataprep
Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
-
yobulkdev
Open Source & AI driven Data Onboarding Platform: Free flatfile.com alternative
Project mention: Ask HN: What Underrated Open Source Project Deserves More Recognition? | news.ycombinator.com | 2024-03-07
"OpenRefine is a powerful free, open source tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data." https://openrefine.org/
Datacleaning related posts
-
Data Quality at Scale with Great Expectations, Spark, and Airflow on EMR
-
Soda Core (OSS) is now GA! So, why should you add checks to your data pipelines?
-
Greatexpectations - Always know what to expect from your data.
-
[D] Do you use data engineering pipelines for real life projects?
-
Just starting to get into automated testing, should I be looking for a dedicated tool or library for data engineering specifically?
-
What Do You Do To Invalid Data In Your Pipeline
-
Index
What are some of the best open-source Datacleaning projects? This list will help you:
Rank | Project | Stars
---|---|---
1 | OpenRefine | 10,498
2 | great_expectations | 9,479
3 | dataprep | 1,920
4 | yobulkdev | 850
5 | pandas-data-cleaner | 5
6 | PARSE-CLIP | 3