Python tabular-data

Open-source Python projects categorized as tabular-data

Top 23 Python tabular-data Projects

  1. autogluon

    Fast and Accurate ML in 3 Lines of Code

    Project mention: AIM Weekly for 04Nov2024 | dev.to | 2024-11-04

    Composed Image Retrieval; Intro to Multimodal LLama 3.2; Multi Agent Concierge; RAG with Langchain Granite, Milvus; Download content; Transformer Replacement?; vLLM for running models; Amphion; Autogluon; Notebook LLama like Google's Notebook LLM; Monocle2ai for tracing GenAI app code LFA&D Project; Bee Agent Framework; LLama RFP Response; GenAI Script; Simular AI Agent S; DrawDB with AI; Ollama with LLama 3.2 Vision!!!! Preview; Powerful RAG Checker; SQL Generator; Role of LLMs; Document Extraction; Open Source Vector DB Reddit; The Practical Guide to Self Hosting LLM; Stagehand Controller; Understanding HNSWLIB; Best practices in RAG; Enigma Agent; Langchain, Ollama, Phi3 for Function Calling; Compass Judger; Princeton NLP SimPO; Princeton NLP ProLong; Princeton NLP HELMET; Ollama Cheatsheet; Princeton NLP CopyCat; Princeton NLP Shp; Can LLM Solve Hard Github Issues; Enabling Large Language Models to Generate Text with Citations; Princeton NLP CharXiv; Awesome AI Agents List; Nomic's Matryoshka text embedding model

  3. vaex

    Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

  4. visidata

    A terminal spreadsheet multitool for discovering and arranging data

    Project mention: Data Science at the Command Line, 2nd Edition (2021) | news.ycombinator.com | 2024-05-06

    I'd like to call out one of my favorite pieces of software from the past 10 years: VisiData [1] has completely changed the way I do ad-hoc data processing, and is now my go-to for pretty much all use cases that I previously used spreadsheets for, and about half of those I previously used databases for.

    It's a TUI application, not strictly CLI, but scriptable, and I figure anyone building pipelines using tools like jq, q, awk, grep, etc. to process tabular data will find it extremely useful.

    ----

    [1]: https://visidata.org

  5. TabPFN

    ⚡ TabPFN: Foundation Model for Tabular Data ⚡

    Project mention: What went wrong with the Alan Turing Institute? | news.ycombinator.com | 2025-03-27
  6. tabnet

    PyTorch implementation of the TabNet paper: https://arxiv.org/pdf/1908.07442.pdf

  7. Auto-PyTorch

    Automatic architecture search and hyperparameter optimization for PyTorch

  8. sketch

    AI code-writing assistant that understands data content

  10. DataProfiler

    What's in your data? Extract schema, statistics and entities from datasets

  11. CTGAN

    Conditional GAN for generating synthetic tabular data.

  12. pytorch-widedeep

    A flexible package for multimodal deep learning to combine tabular data with text and images using Wide and Deep models in PyTorch

  13. Transformers4Rec

    Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.

  14. tab-transformer-pytorch

    Implementation of TabTransformer, an attention network for tabular data, in PyTorch

  15. rows

    A common, beautiful interface to tabular data, no matter the format

  16. Multimodal-Toolkit

    Multimodal model for text and tabular data with HuggingFace transformers as the building block for text data

  17. Copulas

    A library to model multivariate data using copulas.

  18. synthcity

    A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.

  19. saint

    The official PyTorch implementation of the paper "SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training"

  20. carefree-learn

    Deep Learning ❤️ PyTorch

  21. TabFormer

    Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)

  22. tableQA

    AI Tool for querying natural language on tabular data.

  23. tabular-dl-tabr

    The implementation of "TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning"

  24. ExtractTable-py

    Python library to extract tabular data from images and scanned PDFs

  25. SDGym

    Benchmarking synthetic data generation methods.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python tabular-data related posts

  • What went wrong with the Alan Turing Institute?

    1 project | news.ycombinator.com | 27 Mar 2025
  • Ctgan: Generating synthetic data in Python using GANs

    1 project | news.ycombinator.com | 5 Feb 2024
  • [Project] AMLTK: A framework for building your own AutoML (AutoSklearn authors)

    2 projects | /r/MachineLearning | 9 Dec 2023
  • Time series data into a CNN

    1 project | /r/learnmachinelearning | 1 Mar 2023
  • Looking to switch from full-time open source to an early stage startup

    3 projects | /r/datascience | 25 Jan 2023
  • Benchmark synthetic tabular data generators using Synthcity

    1 project | /r/learnmachinelearning | 23 Jan 2023
  • Tired of synthetic corgi? Check out Synthcity, a tool for synthetic tabular data

    2 projects | news.ycombinator.com | 19 Jan 2023

Index

What are some of the best open-source tabular-data projects in Python? This list will help you:

#    Project                    Stars
1    autogluon                  8,682
2    vaex                       8,373
3    visidata                   8,162
4    TabPFN                     3,440
5    tabnet                     2,753
6    Auto-PyTorch               2,437
7    sketch                     2,253
8    DataProfiler               1,479
9    CTGAN                      1,374
10   pytorch-widedeep           1,340
11   Transformers4Rec           1,168
12   tab-transformer-pytorch    910
13   rows                       875
14   Multimodal-Toolkit         603
15   Copulas                    584
16   synthcity                  540
17   saint                      415
18   carefree-learn             407
19   TabFormer                  331
20   tableQA                    307
21   tabular-dl-tabr            291
22   ExtractTable-py            277
23   SDGym                      272
