Top 12 Python preprocessing Projects
-
ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
-
igel
A delightful machine learning tool that lets you train, test, and use models without writing code.
-
NVTabular
NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate the terabyte-scale datasets used to train deep learning-based recommender systems.
-
pytorch-VideoDataset
Tools for loading video datasets and applying transforms to video in PyTorch. You can load video files directly, without preprocessing.
-
courlan
Clean, filter and sample URLs to optimize data collection – includes spam, content type and language filters
Project mention: RAGFlow is an open-source RAG engine based on deep document understanding | news.ycombinator.com | 2024-04-01
Just link them to https://github.com/infiniflow/ragflow/blob/main/rag/llm/chat... :)
Index
What are some of the best open-source preprocessing projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | ragflow | 5,516 |
2 | igel | 3,080 |
3 | MLBox | 1,475 |
4 | NVTabular | 1,004 |
5 | nnAudio | 953 |
6 | voicesmith | 207 |
7 | pytorch-VideoDataset | 67 |
8 | courlan | 65 |
9 | podium | 60 |
10 | cpip | 38 |
11 | VHDLproc | 24 |
12 | riffusion-scripts | 0 |