Top 5 Python data-wrangling Projects
-
Optimus
Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark (by ironmussa)
-
prosto
Prosto is a data processing toolkit that radically changes how data is processed by relying heavily on functions and operations with functions, as an alternative to map-reduce and join-groupby
-
mongorefine
Experimental headless data wrangling / refining tool over MongoDB, inspired by OpenRefine
Project mention: Are there any Python libraries for Data Cleansing ? | /r/dataengineering | 2023-12-08
Index
What are some of the best open-source data-wrangling projects in Python? This list will help you:
Rank | Project | Stars
---|---|---
1 | Optimus | 1,441
2 | skrub | 1,009
3 | prosto | 89
4 | pipda | 35
5 | mongorefine | 2