Python data-wrangling

Open-source Python projects categorized as data-wrangling

Top 6 Python data-wrangling Projects

  1. datachain

    ETL, Analytics, Versioning for Unstructured Data

    Project mention: DBT for Unstructured Data – DataChain | news.ycombinator.com | 2024-11-04

  2. Optimus

    Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark (by ironmussa)

  3. skrub

    Prepping tables for machine learning

  4. prosto

    Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

  5. pipda

    A framework for data piping in Python

  6. mongorefine

    Experimental headless data wrangling / refining tool over MongoDB, inspired by OpenRefine

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source data-wrangling projects in Python? This list will help you:

#  Project      Stars
1  datachain    2,205
2  Optimus      1,488
3  skrub        1,270
4  prosto          91
5  pipda           37
6  mongorefine      2
