Python preprocessing

Open-source Python projects categorized as preprocessing

Top 12 Python preprocessing projects

  • ragflow

    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

  • Project mention: RAGFlow is an open-source RAG engine based on deep document understanding | news.ycombinator.com | 2024-04-01

    Just link them to https://github.com/infiniflow/ragflow/blob/main/rag/llm/chat... :)

  • igel

    A delightful machine learning tool that allows you to train, test, and use models without writing code

  • MLBox

    MLBox is a powerful Automated Machine Learning python library.

  • NVTabular

    NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

  • nnAudio

    Audio processing using PyTorch 1D convolutional networks

  • voicesmith

    [WIP] VoiceSmith makes training text to speech models easy.

  • pytorch-VideoDataset

    Tools for loading video datasets and applying transforms to video in PyTorch. You can load video files directly, without preprocessing.

  • courlan

    Clean, filter and sample URLs to optimize data collection – includes spam, content type and language filters

  • podium

    Podium: a framework-agnostic Python NLP library for data loading and preprocessing

  • cpip

    CPIP - a C/C++ preprocessor implemented in Python.

  • VHDLproc

    VHDLproc is a VHDL preprocessor

  • riffusion-scripts

    Scripts to aid in preparing training data for riffusion.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source preprocessing projects in Python? This list will help you:

#   Project               Stars
1   ragflow               5,516
2   igel                  3,080
3   MLBox                 1,475
4   NVTabular             1,004
5   nnAudio                 953
6   voicesmith              207
7   pytorch-VideoDataset     67
8   courlan                  65
9   podium                   60
10  cpip                     38
11  VHDLproc                 24
12  riffusion-scripts         0
