Python feature-engineering

Open-source Python projects categorized as feature-engineering

Top 22 Python feature-engineering Projects

  • nni

    An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

  • featuretools

    An open source Python library for automated feature engineering

  • Project mention: Featuretools – A Python Library for Automated Feature Engineering | news.ycombinator.com | 2023-09-20
  • mljar-supervised

    Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation

  • Project mention: Show HN: Web App with GUI for AutoML on Tabular Data | news.ycombinator.com | 2023-08-24

    The Web App uses two open-source packages that I've created:

    - MLJAR AutoML - Python package for AutoML on tabular data https://github.com/mljar/mljar-supervised

    - Mercury - framework for converting Jupyter Notebooks into Web App https://github.com/mljar/mercury

    You can run the Web App locally. What is more, you can adjust the notebook's code to your needs. For example, you can set different validation strategies, evaluation metrics, or longer training times. The notebooks in the repo are a good starting point for developing more advanced apps.

  • NVTabular

    NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

  • functime

    Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

  • Project mention: functime: NEW Data - star count:616.0 | /r/algoprojects | 2023-11-08
  • tsfel

    An intuitive library to extract features from time series.

  • intelligent-trading-bot

    Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering

  • Project mention: TimeGPT-1 | news.ycombinator.com | 2023-10-13

    I agree that conventional (numeric) forecasting can hardly benefit from the newest approaches like transformers and LLMs. I reached this conclusion while working on the intelligent trading bot [0] and experimenting with many ML algorithms. Yet there are some cases where transformers might provide significant advantages. They could be useful where (numeric) forecasting is augmented with discrete event analysis and where sequences of events are important. Another use case is where certain patterns are important, like those detected in technical analysis. For these cases, however, much more data is needed.

    [0] https://github.com/asavinov/intelligent-trading-bot Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering

  • evalml

    EvalML is an AutoML library written in Python.

  • temporian

    Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖

  • Project mention: Temporian: Google's Python package for time series preprocessing | news.ycombinator.com | 2024-02-13
  • Hyperactive

    An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

  • Project mention: Hyperactive Version 4.5 Released | news.ycombinator.com | 2023-08-27
  • tsflex

    Flexible time series feature extraction & processing

  • hrv-analysis

    Package for Heart Rate Variability analysis in Python

  • feathub

    FeatHub - A stream-batch unified feature store for real-time machine learning

  • Project mention: FLaNK Stack Weekly for 20 June 2023 | dev.to | 2023-06-20

    Feature Hub for Flink and Spark https://github.com/alibaba/feathub

  • upgini

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

  • Project mention: The fastest way to improve quality of ML model on tabular data | /r/learnmachinelearning | 2023-06-18

    web: https://upgini.com

  • NitroFE

    NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for providing continuous calculation.

  • prosto

    Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

  • bytehub

    ByteHub: making feature stores simple

  • ds2

    Easiest way to use AI models without coding (Web UI & API support) (by DS2BRAIN)

  • volga

    Feature Engine for real-time AI/ML

  • Project mention: Volga – Open-Source Feature Engine for Real-Time AI | news.ycombinator.com | 2024-04-08

    Volga is an open-source feature calculation engine for real-time ML systems that aims to remove the dependency on complex Spark/Flink setups and managed feature platforms like Tecton.ai, Fennel.ai, FeatureForm, and Chalk.ai.

    GitHub - https://github.com/anovv/volga

  • lambdo

    Feature engineering and machine learning: together at last!

  • HDB_Resale_Prices

    Predicted Singapore HDB resale prices (2015-2019) and identified their drivers, with an R-squared of 0.96 and an MAE of $20,000. Deployed as a Streamlit web app for user price prediction.

  • dpq

    dpq is an open-source Python library that makes prompt-based data transformations and feature engineering easy

  • Project mention: Show HN: Dpq – a small Python library to process data using LLMs | news.ycombinator.com | 2024-04-12
NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source feature-engineering projects in Python? This list will help you:

 #  Project                   Stars
 1  nni                      13,726
 2  featuretools              7,017
 3  mljar-supervised          2,929
 4  NVTabular                 1,004
 5  functime                    891
 6  tsfel                       852
 7  intelligent-trading-bot     737
 8  evalml                      709
 9  temporian                   619
10  Hyperactive                 487
11  tsflex                      360
12  hrv-analysis                342
13  feathub                     293
14  upgini                      290
15  NitroFE                     106
16  prosto                       89
17  bytehub                      57
18  ds2                          48
19  volga                        29
20  lambdo                       22
21  HDB_Resale_Prices            21
22  dpq                          15
