Python Pandas

Open-source Python projects categorized as Pandas

Top 23 Python Pandas Projects

  1. Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

    Project mention: Building a Sarcasm Detection System with LSTM and GloVe: A Complete Guide | dev.to | 2025-01-02

    Pandas

  3. 30-Days-Of-Python

    30 days of Python programming challenge is a step-by-step guide to learn the Python programming language in 30 days. This challenge may take more than 100 days, follow your own pace. These videos may help too: https://www.youtube.com/channel/UC7PNRuno1rzYPb1xLa4yktw

    Project mention: Top 10 GitHub Repositories for Python and Java Developers | dev.to | 2024-05-03

    4. Asabeneh/30-Days-Of-Python - This repository presents a 30-day challenge for beginners to learn Python from the ground up. The course covers everything from the basics to more advanced topics like statistics, data analysis, and web development. https://github.com/Asabeneh/30-Days-Of-Python

  4. tqdm

    ⚡ A Fast, Extensible Progress Bar for Python and CLI

    Project mention: FLaNK-AIM: 20 May 2024 Weekly | dev.to | 2024-05-20
  5. data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  6. datasets

    🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

    Project mention: 20 Open Source Tools I Recommend to Build, Share, and Run AI Projects | dev.to | 2024-11-13

    Datasets library repository for accessing and sharing datasets with the community.

  7. yfinance

    Download market data from Yahoo! Finance's API

  8. pandas-ai

    Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

    Project mention: Using RAG to Build Your IDE Agents | dev.to | 2024-06-18

    In this blog, we will build a powerful IDE agent for PandasAI using Dash Agent. Then later on, we'll understand how using RAG can significantly improve LLM responses.

  9. pygwalker

    PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

    Project mention: A simple way to explore data through a Tableau-like UI directly in your data app | news.ycombinator.com | 2024-12-30

    I believe this is just a wrapper around pygwalker, which is a nice project: https://github.com/Kanaries/pygwalker

    I really like the typescript graphic walker: https://github.com/Kanaries/graphic-walker

  10. Dask

    Parallel computing with task scheduling

    Project mention: Ask HN: What's the right tool for this job? | news.ycombinator.com | 2024-07-20

    From what I've seen, there are sort of two paths. I'll provide a well known example from each.

    1. lang specific distributed task library

    For example, in Python, celery is a pretty popular task system. If you (the dev) are the one doing all the code and running the workflows, it might work well for you. You build the core code and functions, and it handles the processing and resource stuff with a little config.

    * https://github.com/celery/celery

    Or lower level:

    * https://github.com/dask/dask

    2. DAG Workflow systems

    There are also whole systems for what you're describing. They've gotten especially popular in the ML ops and data engineering world. A common one is AirFlow:

    * https://github.com/apache/airflow

  11. seaborn

    Statistical data visualization in Python

    Project mention: 1MinDocker #6 - Building further | dev.to | 2024-11-11

    seaborn

  12. ydata-profiling

    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

  13. modin

    Modin: Scale your Pandas workflows by changing a single line of code

  14. mlcourse.ai

    Open Machine Learning Course

  15. visidata

    A terminal spreadsheet multitool for discovering and arranging data

    Project mention: Data Science at the Command Line, 2nd Edition (2021) | news.ycombinator.com | 2024-05-06

    I'd like to call out one of my favorite pieces of software from the past 10 years: VisiData [1] has completely changed the way I do ad-hoc data processing, and is now my go-to for pretty much all use cases that I previously used spreadsheets for, and about half of those I previously used databases for.

    It's a TUI application, not strictly CLI, but scriptable, and I figure anyone building pipelines using tools like jq, q, awk, grep, etc. to process tabular data will find it extremely useful.

    ----

    [1]: https://visidata.org

  16. pandas-ta

    Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 150+ Indicators

  17. ibis

    The portable Python dataframe library

    Project mention: FireDucks: Pandas but 100x Faster | news.ycombinator.com | 2024-11-20
  18. lux

    Automatically visualize your pandas dataframe via a single print! 📊 💡 (by lux-org)

  19. orange

    🍊 📊 💡 Orange: Interactive data analysis

    Project mention: Hierarchical Clustering | news.ycombinator.com | 2024-04-20

    I know I've tooted its horn before, but Orange3 is a pretty neat Python-based GUI platform that makes this and a metric buttload of other statistical/ML techniques available to non-programmer types.

    Just watch out for the null character `\x00` in the corpus. That always seems to kill it stone dead.

    https://orangedatamining.com/

    https://orange3.readthedocs.io/projects/orange-visual-progra...

  20. geopandas

    Python tools for geographic data

    Project mention: Rivian GeoLocation Plotting with IRIS Cloud Document and Databricks | dev.to | 2024-12-26

    We are using geopandas and geodatasets for a straightforward approach to plotting.

  21. Mimesis

    Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.

  22. alpha_vantage

    A Python wrapper for the Alpha Vantage API for financial data.

  23. pytorch-forecasting

    Time series forecasting with PyTorch

  24. missingno

    Missing data visualization module for Python.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Pandas related posts

  • Fixing timestamp overflow error in Python

    1 project | news.ycombinator.com | 30 Dec 2024
  • Rivian GeoLocation Plotting with IRIS Cloud Document and Databricks

    3 projects | dev.to | 26 Dec 2024
  • Build a Competitive Intelligence Tool Powered by AI

    3 projects | dev.to | 29 Nov 2024
  • FireDucks: Pandas but 100x Faster

    4 projects | news.ycombinator.com | 20 Nov 2024
  • The Polars vs. Pandas difference nobody is talking about – Labs

    1 project | news.ycombinator.com | 16 Nov 2024
  • DuckDB over Pandas/Polars

    2 projects | news.ycombinator.com | 5 Nov 2024
  • How to Use Lambda Functions in Python

    1 project | dev.to | 30 Oct 2024

Index

What are some of the best open-source Pandas projects in Python? This list will help you:

# Project Stars
1 Pandas 44,267
2 30-Days-Of-Python 43,951
3 tqdm 29,050
4 data-science-ipython-notebooks 27,721
5 datasets 19,470
6 yfinance 15,414
7 pandas-ai 13,970
8 pygwalker 13,701
9 Dask 12,857
10 seaborn 12,743
11 ydata-profiling 12,634
12 modin 9,980
13 mlcourse.ai 9,862
14 visidata 7,995
15 pandas-ta 5,637
16 ibis 5,434
17 lux 5,240
18 orange 4,946
19 geopandas 4,599
20 Mimesis 4,474
21 alpha_vantage 4,350
22 pytorch-forecasting 4,063
23 missingno 3,999
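
The index rows above are plain text, with thousands separators in the star counts. As a rough sketch (standard library only, and not part of the original listing), the rows could be parsed and the stated ordering by stars checked as follows. The names `INDEX_ROWS`, `parse_index`, and `is_sorted_by_stars` are hypothetical, and only the first five rows are copied in for brevity.

```python
# Hypothetical helper names; the rows are copied from the first five index entries.
INDEX_ROWS = """\
1 Pandas 44,267
2 30-Days-Of-Python 43,951
3 tqdm 29,050
4 data-science-ipython-notebooks 27,721
5 datasets 19,470
"""


def parse_index(text):
    """Parse 'rank name stars' rows into (rank, name, stars) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split the rank off the front and the star count off the back,
        # so a project name containing spaces would still parse.
        rank, rest = line.split(maxsplit=1)
        name, stars = rest.rsplit(maxsplit=1)
        rows.append((int(rank), name, int(stars.replace(",", ""))))
    return rows


def is_sorted_by_stars(rows):
    """Return True if star counts never increase from one row to the next."""
    return all(a[2] >= b[2] for a, b in zip(rows, rows[1:]))


rows = parse_index(INDEX_ROWS)
print(rows[0])
print(is_sorted_by_stars(rows))
```

The same tuples could be handed to `pandas.DataFrame` with column names of your choosing if a labeled table is preferred; that step is left out here to keep the sketch dependency-free.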

Did you know that Python is the 2nd most popular programming language based on number of references?