Python feature-engineering

Open-source Python projects categorized as feature-engineering

Top 23 Python feature-engineering Projects

  1. featuretools

    An open-source Python library for automated feature engineering

  3. mljar-supervised

    Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation

    Project mention: Python, notebooks, no code recipes, AI = new desktop app for data analysis | news.ycombinator.com | 2025-06-01
  4. intelligent-trading-bot

    Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering

  5. functime

    Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

  6. NVTabular

    NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate terabyte-scale datasets used to train deep-learning-based recommender systems.

  7. tsfel

    An intuitive library to extract features from time series.

  8. evalml

    EvalML is an AutoML library written in Python.

  10. pixeltable

    Pixeltable: AI data infrastructure providing a declarative, incremental approach for multimodal workloads.

    Project mention: Stop Gluing Data Infrastructure Tools: Build Multimodal AI Workloads and Application with One Declarative Python SDK | dev.to | 2025-07-06

    Star us on GitHub: https://github.com/pixeltable/pixeltable

  11. temporian

    Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖

  12. Hyperactive

    An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

  13. tsflex

    Flexible time series feature extraction & processing

  14. hrv-analysis

    Package for Heart Rate Variability analysis in Python

  15. feathub

    FeatHub - A stream-batch unified feature store for real-time machine learning

  16. upgini

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

  17. CAAFE

    Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023).

  18. NitroFE

    NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for providing continuous calculation.

  19. prosto

    Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

  20. bytehub

    ByteHub: making feature stores simple

  21. ds2

    Easiest way to use AI models without coding (Web UI & API support) (by DS2BRAIN)

  22. dpq

    dpq is an open-source Python library that makes prompt-based data transformations and feature engineering easy

  23. lambdo

    Feature engineering and machine learning: together at last!

  24. HDB_Resale_Prices

    Predicted and identified the drivers of Singapore HDB resale prices (2015-2019) with an R-squared of 0.96 and an MAE of $20,000. Deployed as a Streamlit web app for user price prediction.

  25. dataclr

    Feature selection for tabular datasets using advanced filter and wrapper methods

    Project mention: Show HN: Dataclr – Python library simplifying feature selection for ML | news.ycombinator.com | 2025-01-06
NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python feature-engineering related posts

  • Show HN: Dataclr – Python library simplifying feature selection for ML

    1 project | news.ycombinator.com | 6 Jan 2025
  • Dataclr – New feature selection algorithm for ML achieving SOTA results

    1 project | news.ycombinator.com | 5 Jan 2025
  • Temporian: Google's Python package for time series preprocessing

    1 project | news.ycombinator.com | 13 Feb 2024
  • temporian: NEW Data - star count:283.0

    1 project | /r/algoprojects | 21 Nov 2023

Index

What are some of the best open-source feature-engineering projects in Python? This list will help you:

# Project Stars
1 featuretools 7,528
2 mljar-supervised 3,193
3 intelligent-trading-bot 1,474
4 functime 1,116
5 NVTabular 1,097
6 tsfel 1,038
7 evalml 824
8 pixeltable 741
9 temporian 695
10 Hyperactive 529
11 tsflex 427
12 hrv-analysis 416
13 feathub 338
14 upgini 337
15 CAAFE 168
16 NitroFE 106
17 prosto 91
18 bytehub 61
19 ds2 50
20 dpq 24
21 lambdo 24
22 HDB_Resale_Prices 23
23 dataclr 17
