Python feature-engineering

Open-source Python projects categorized as feature-engineering

Top 23 Python feature-engineering Projects

  1. tpot

    A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

    Project mention: Evolve Your Machine Learning: Automate the Process of Model Selection through TPOT. | dev.to | 2024-07-06

    Resources: TPOT Documentation, Genetic Programming

  2. featuretools

    An open-source Python library for automated feature engineering

  3. mljar-supervised

    Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation

    Project mention: Show HN: Supertree – interactive visualization of decision trees in Python | news.ycombinator.com | 2024-08-27

    We would like to keep the package sustainable. Earlier, we created an AutoML package under the MIT license (https://github.com/mljar/mljar-supervised), and it is very hard to monetise; you need funds to keep a package maintained and to keep working on it.

    Regarding purchasing, we just haven't had time to create a landing page with a buy button :) We will add it soon. The package will cost 499 USD per year. We already have a few finance companies interested.

  4. intelligent-trading-bot

    Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering

    Project mention: Show HN: High-Frequency Trading and Market-Making Backtesting Tool with Examples | news.ycombinator.com | 2024-06-21

    You could try a tool for trade signal generation based on machine learning and feature engineering:

    https://github.com/asavinov/intelligent-trading-bot

    It trains ML models based on historic data and custom features and then uses them to generate a kind of intelligent indicator between -1 and +1. This intelligent indicator is then used to make trade decisions. Frequency is a parameter and can vary from 1 minute for crypto trading to 1 day for normal exchanges.

  5. functime

    Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

  6. NVTabular

    NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

  7. tsfel

    An intuitive library to extract features from time series.

  8. evalml

    EvalML is an AutoML library written in Python.

    Project mention: 10 Open Source MLOps Projects You Didn’t Know About | dev.to | 2024-08-01

    Hyperparameter tuning and evaluating ML models are integral aspects of ML product development. EvalML is an AutoML library that aims to ease the process of building, optimizing, and evaluating ML models by helping engineers avoid manual training and tuning of models. It also includes data quality checks and cross-validation.

  9. temporian

    Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖

    Project mention: Temporian: Google's Python package for time series preprocessing | news.ycombinator.com | 2024-02-13
  10. Hyperactive

    An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

  11. tsflex

    Flexible time series feature extraction & processing

  12. hrv-analysis

    Package for Heart Rate Variability analysis in Python

  13. upgini

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

  14. feathub

    FeatHub - A stream-batch unified feature store for real-time machine learning

  15. pixeltable

    Pixeltable - AI Data infrastructure providing a declarative, incremental approach for multimodal workloads.

    Project mention: Pixeltable: Store, transform, index, and iterate on data for ML | news.ycombinator.com | 2024-12-17
  16. CAAFE

    Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023).

  17. NitroFE

    NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for providing continuous calculation.

  18. prosto

    Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

  19. bytehub

    ByteHub: making feature stores simple

  20. ds2

    Easiest way to use AI models without coding (Web UI & API support) (by DS2BRAIN)

  21. volga

    Data Processing/Feature Calculation Engine for real-time AI/ML

    Project mention: Volga – Open-Source Feature Engine for Real-Time AI | news.ycombinator.com | 2024-04-08

    Volga is an open-source feature calculation engine for real-time ML systems aimed to remove dependency on complex Spark/Flink setups and managed feature platforms like Tecton.ai, Fennel.ai, FeatureForm, Chalk.ai

    GitHub: https://github.com/anovv/volga

  22. dpq

    dpq is an open-source Python library that makes prompt-based data transformations and feature engineering easy

    Project mention: Show HN: Dpq – a small Python library to process data using LLMs | news.ycombinator.com | 2024-04-12
  23. HDB_Resale_Prices

    Predicted Singapore HDB resale prices (2015-2019) and identified their drivers, with an R² of 0.96 and an MAE of $20,000. Deployed as a Streamlit web app for user price prediction.
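
The last entry quotes two regression metrics, R-squared and mean absolute error (MAE). As a rough illustration of what those numbers measure, here is a minimal sketch in plain Python. The function names and the price figures are made up for this example and are not taken from the HDB_Resale_Prices project.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Share of the variance in `actual` that `predicted` explains."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    return 1 - ss_res / ss_tot


# Hypothetical resale prices in dollars, for illustration only.
actual = [400_000, 500_000, 600_000, 700_000]
predicted = [420_000, 480_000, 610_000, 690_000]

mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MAE: ${mae:,.0f}  R-squared: {r2:.2f}")
```

An MAE of $20,000 therefore means the predictions miss by $20,000 on average, while an R-squared of 0.96 means the model accounts for 96% of the variance in prices.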

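The intelligent-trading-bot mention above describes an indicator between -1 and +1 that drives trade decisions. One simple way such a score could be turned into an action is a pair of thresholds, sketched below. The function name `decide` and the 0.5 and -0.5 cut-offs are assumptions chosen for illustration; the project's actual decision logic may differ.

```python
def decide(score, buy_threshold=0.5, sell_threshold=-0.5):
    """Map a model score in [-1, +1] to a trade decision.

    The thresholds are hypothetical defaults, not values from the project.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie between -1 and +1")
    if score >= buy_threshold:
        return "buy"
    if score <= sell_threshold:
        return "sell"
    return "hold"


# Hypothetical indicator values, one per time step.
scores = [0.8, 0.1, -0.7]
decisions = [decide(s) for s in scores]
print(decisions)
```

Widening the gap between the two thresholds makes the sketch trade less often, which is one way to reduce reactions to weak signals.
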
NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source feature-engineering projects in Python? This list will help you:

# Project Stars
1 tpot 9,806
2 featuretools 7,336
3 mljar-supervised 3,094
4 intelligent-trading-bot 1,174
5 functime 1,066
6 NVTabular 1,064
7 tsfel 960
8 evalml 799
9 temporian 685
10 Hyperactive 515
11 tsflex 408
12 hrv-analysis 399
13 upgini 322
14 feathub 317
15 pixeltable 141
16 CAAFE 140
17 NitroFE 106
18 prosto 91
19 bytehub 58
20 ds2 48
21 volga 40
22 dpq 25
23 HDB_Resale_Prices 23
