Python data-preprocessing

Open-source Python projects categorized as data-preprocessing

Top 4 Python data-preprocessing Projects

  1. skrub

    Prepping tables for machine learning

  2. prosto

    Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

  3. degradr

    Python library for realistically degrading images.

  4. dataclr

    Feature selection for tabular datasets using advanced filter and wrapper methods

    Project mention: Show HN: Dataclr – Python library simplifying feature selection for ML | news.ycombinator.com | 2025-01-06

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
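
The dataclr entry above mentions "filter methods" for feature selection. As a rough illustration of that general idea only, the sketch below ranks columns by their absolute Pearson correlation with the target and keeps the top k. It uses only the standard library and is not dataclr's API; the names `pearson`, `filter_select`, and the toy table are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def filter_select(columns, target, k):
    """Keep the k feature names with the highest |correlation| to the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical toy table: column name -> values (illustration only)
columns = {
    "size": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noise": [2.0, -1.0, 2.0, -1.0, 2.0],
    "constant": [7.0, 7.0, 7.0, 7.0, 7.0],
}
target = [2.1, 3.9, 6.2, 8.0, 9.9]

selected = filter_select(columns, target, k=1)
print(selected)
```

A filter method like this scores each feature independently of any model. Wrapper methods, by contrast, evaluate subsets of features by training a model on each candidate subset, which is typically more expensive.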

Python data-preprocessing related posts

  • Show HN: Dataclr – Python library simplifying feature selection for ML

    1 project | news.ycombinator.com | 6 Jan 2025
  • Dataclr – New feature selection algorithm for ML achieving SOTA results

    1 project | news.ycombinator.com | 5 Jan 2025
  • Framework for Data ETL with multiple export templates ?

    1 project | /r/Python | 14 Jul 2021
  • convtools - define conversions, aggregations and joins in functional style (ad-hoc code generation)

    1 project | /r/Python | 8 Jul 2021
  • Writing concise functional code in python

    2 projects | /r/Python | 6 Jul 2021

Index

What are some of the best open-source data-preprocessing projects in Python? This list will help you:

#  Project  Stars
1  skrub    1,270
2  prosto      91
3  degradr     18
4  dataclr     12

