Python feature-engineering

Open-source Python projects categorized as feature-engineering

Top 23 Python feature-engineering Projects

  1. featuretools

    An open-source Python library for automated feature engineering

  2. mljar-supervised

    Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation

    Project mention: Ask HN: What Are You Working On? (May 2026) | news.ycombinator.com | 2026-05-10

    I'm working on an AI data analyst, MLJAR Studio. It is a conversational UI with an AI agent that uses Python to provide data insights. It is available as a desktop application: https://mljar.com

  3. intelligent-trading-bot

    Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering

  4. pixeltable

    Declarative and Incremental Backend for Multimodal AI Applications

    Project mention: Stop Gluing Data Infrastructure Tools: Build Multimodal AI Workloads and Application with One Declarative Python SDK | dev.to | 2025-07-06

    Star us on GitHub: https://github.com/pixeltable/pixeltable

  5. functime

    Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

  6. NVTabular

    NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

  7. tsfel

    An intuitive library to extract features from time series.

  8. evalml

    EvalML is an AutoML library written in Python.

  9. temporian

    Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖

  10. Tabular-data-generation

    GANs are well known for their success in realistic image generation. However, they can also be applied to tabular data generation. We will review and examine some recent papers about tabular GANs in action.

    Project mention: AI.Insaf (@ai_tablet): Complete archive of channel posts | dev.to | 2026-06-03

  11. Hyperactive

    A unified interface for optimization algorithms and experiments

  12. hrv-analysis

    Package for Heart Rate Variability analysis in Python

  13. tsflex

    Flexible time series feature extraction & processing

  14. upgini

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

  15. feathub

    FeatHub - A stream-batch unified feature store for real-time machine learning

  16. CAAFE

    Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023).

  17. NitroFE

    NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for providing continuous calculation.

  18. prosto

    Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

  19. bytehub

    ByteHub: making feature stores simple

  20. ds2

    Easiest way to use AI models without coding (Web UI & API support) (by DS2BRAIN)

  21. Skyulf

    Build and ship production ML pipelines faster: a pipeline library with an optional self-hosted visual layer for modular, reproducible workflows, local testing, and experiment tracking.

    Project mention: Show HN: I built a visual, MLOps tool (Skyulf) | news.ycombinator.com | 2026-01-06

  22. social-media-ai-engineering-etl

    Real-world AI engineering dataset creation, SFT fine-tuning, and GRPO alignment ETL pipeline.

    Project mention: Real-world dataset creation, SFT fine-tuning, and GRPO alignment pipeline | news.ycombinator.com | 2025-08-28

  23. dpq

    dpq is an open-source Python library that makes prompt-based data transformations and feature engineering easy

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
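
Several entries above (tsfel, tsflex, functime) describe extracting features from time series. As a rough illustration of that general idea, here is a minimal sketch in plain Python that computes summary statistics over sliding windows. It does not use or reproduce the API of any listed library; the function name `window_features` and its parameters are hypothetical, chosen only for this example.

```python
from statistics import mean, pstdev


def window_features(values, window_size, step):
    """Compute simple summary features over sliding windows of a series.

    Hypothetical helper for illustration; not part of any listed library.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    rows = []
    # Slide a fixed-size window across the series, moving `step` items each time.
    for start in range(0, len(values) - window_size + 1, step):
        window = values[start:start + window_size]
        rows.append({
            "start": start,            # index where the window begins
            "mean": mean(window),
            "std": pstdev(window),     # population standard deviation
            "min": min(window),
            "max": max(window),
        })
    return rows


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = window_features(signal, window_size=4, step=2)
for row in features:
    print(row)
```

Each resulting row can serve as one feature vector for a downstream model. The dedicated libraries in the list typically offer many more feature types and handle large or irregular data, which this sketch does not attempt.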

Index

What are some of the best open-source feature-engineering projects in Python? This list will help you:

#   Project                          Stars
1   featuretools                     7,655
2   mljar-supervised                 3,265
3   intelligent-trading-bot          1,705
4   pixeltable                       1,568
5   functime                         1,178
6   NVTabular                        1,146
7   tsfel                            1,094
8   evalml                             849
9   temporian                          712
10  Tabular-data-generation            570
11  Hyperactive                        550
12  hrv-analysis                       445
13  tsflex                             440
14  upgini                             350
15  feathub                            349
16  CAAFE                              182
17  NitroFE                            108
18  prosto                              93
19  bytehub                             61
20  ds2                                 50
21  Skyulf                              44
22  social-media-ai-engineering-etl     34
23  dpq                                 25
