Top 23 Python tabular-data Projects
-
vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second
-
Project mention: Data Science at the Command Line, 2nd Edition (2021) | news.ycombinator.com | 2024-05-06
I'd like to call out one of my favorite pieces of software from the past 10 years: VisiData [1] has completely changed the way I do ad-hoc data processing, and is now my go-to for pretty much all use cases that I previously used spreadsheets for, and about half of those I previously used databases for.
It's a TUI application, not strictly CLI, but scriptable, and I figure anyone building pipelines using tools like jq, q, awk, grep, etc. to process tabular data will find it extremely useful.
----
[1]: https://visidata.org
-
Project mention: What went wrong with the Alan Turing Institute? | news.ycombinator.com | 2025-03-27
-
pytorch-widedeep
A flexible package for multimodal-deep-learning to combine tabular data with text and images using Wide and Deep models in Pytorch
-
Transformers4Rec
Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
-
tab-transformer-pytorch
Implementation of TabTransformer, attention network for tabular data, in Pytorch
-
Multimodal-Toolkit
Multimodal model for text and tabular data with HuggingFace transformers as building block for text data
-
synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
-
saint
The official PyTorch implementation of recent paper - SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training
-
TabFormer
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
-
tabular-dl-tabr
The implementation of "TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning"
Python tabular-data discussion
Python tabular-data related posts
-
What went wrong with the Alan Turing Institute?
-
Ctgan: Generating synthetic data in Python using GANs
-
[Project] AMLTK: A framework for building your own AutoML (AutoSklearn authors)
-
Time series data into a CNN
-
Looking to switch from full-time open source to an early stage startup
-
Benchmark synthetic tabular data generators using Synthcity
-
Tired of synthetic corgi? Check out Synthcity, a tool for synthetic tabular data
-
A note from our sponsor - Judoscale
judoscale.com | 25 Apr 2025
Index
What are some of the best open-source tabular-data projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | autogluon | 8,682 |
2 | vaex | 8,373 |
3 | visidata | 8,162 |
4 | TabPFN | 3,440 |
5 | tabnet | 2,753 |
6 | Auto-PyTorch | 2,437 |
7 | sketch | 2,253 |
8 | DataProfiler | 1,479 |
9 | CTGAN | 1,374 |
10 | pytorch-widedeep | 1,340 |
11 | Transformers4Rec | 1,168 |
12 | tab-transformer-pytorch | 910 |
13 | rows | 875 |
14 | Multimodal-Toolkit | 603 |
15 | Copulas | 584 |
16 | synthcity | 540 |
17 | saint | 415 |
18 | carefree-learn | 407 |
19 | TabFormer | 331 |
20 | tableQA | 307 |
21 | tabular-dl-tabr | 291 |
22 | ExtractTable-py | 277 |
23 | SDGym | 272 |