Top 23 Python Panda Projects
-
Pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Project mention: Building a Sarcasm Detection System with LSTM and GloVe: A Complete Guide | dev.to | 2025-01-02
Pandas
-
30-Days-Of-Python
30 days of Python programming challenge is a step-by-step guide to learn the Python programming language in 30 days. This challenge may take more than 100 days, so follow your own pace. These videos may help too: https://www.youtube.com/channel/UC7PNRuno1rzYPb1xLa4yktw
4. Asabeneh/30-Days-Of-Python - This repository presents a 30-day challenge for beginners to learn Python from the ground up. The course covers everything from the basics to more advanced topics like statistics, data analysis, and web development. https://github.com/Asabeneh/30-Days-Of-Python
-
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
-
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Project mention: 20 Open Source Tools I Recommend to Build, Share, and Run AI Projects | dev.to | 2024-11-13
Datasets library repository for accessing and sharing datasets with the community.
-
pandas-ai
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
In this blog, we will build a powerful IDE agent for PandasAI using Dash Agent. Then later on, we'll understand how using RAG can significantly improve LLM responses.
-
Project mention: A simple way to explore data through a Tableau-like UI directly in your data app | news.ycombinator.com | 2024-12-30
I believe this is just a wrapper around pygwalker, which is a nice project: https://github.com/Kanaries/pygwalker
I really like the typescript graphic walker: https://github.com/Kanaries/graphic-walker
-
From what I've seen, there are sort of two paths. I'll provide a well known example from each.
1. Language-specific distributed task library
For example, in Python, celery is a pretty popular task system. If you (the dev) are the one doing all the code and running the workflows, it might work well for you. You build the core code and functions, and it handles the processing and resource stuff with a little config.
* https://github.com/celery/celery
Or lower level:
* https://github.com/dask/dask
2. DAG Workflow systems
There are also whole systems for what you're describing. They've gotten especially popular in the MLOps and data engineering world. A common one is Airflow:
* https://github.com/apache/airflow
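The idea behind the second path can be sketched with the standard library alone: a workflow is a graph of tasks, and the system runs each task only after its dependencies have finished. The following is a minimal illustration, not the Airflow, Celery, or Dask API; the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical three-step pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_pipeline(deps):
    """Visit tasks in dependency order and return the order that was used."""
    order = []
    for task in TopologicalSorter(deps).static_order():
        # A real workflow system would dispatch the task to a worker here.
        order.append(task)
    return order


execution_order = run_pipeline(dependencies)
print(execution_order)
```

A dedicated workflow system adds scheduling, retries, and distributed workers on top of this ordering step.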
-
seaborn
-
ydata-profiling
One line of code for data quality profiling and exploratory data analysis of Pandas and Spark DataFrames.
-
Project mention: Data Science at the Command Line, 2nd Edition (2021) | news.ycombinator.com | 2024-05-06
I'd like to call out one of my favorite pieces of software from the past 10 years: VisiData [1] has completely changed the way I do ad-hoc data processing, and is now my go-to for pretty much all use cases that I previously used spreadsheets for, and about half of those I previously used databases for.
It's a TUI application, not strictly CLI, but scriptable, and I figure anyone building pipelines using tools like jq, q, awk, grep, etc. to process tabular data will find it extremely useful.
----
[1]: https://visidata.org
-
pandas-ta
Technical Analysis Indicators - Pandas TA is an easy-to-use Python 3 Pandas extension with 150+ indicators
-
I know I've tooted its horn before, but Orange3 is a pretty neat Python-based GUI platform that makes this and a metric buttload of other statistical/ML techniques available to non-programmer types.
Just watch out for the null character `\x00` in the corpus. That always seems to kill it stone dead.
https://orangedatamining.com/
https://orange3.readthedocs.io/projects/orange-visual-progra...
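One way to guard against the null-character problem mentioned above is to strip those characters from the text before loading it into any tool. This is a minimal sketch using only built-in string methods; the `strip_null_chars` helper and the sample string are hypothetical and are not part of Orange3.

```python
def strip_null_chars(text: str) -> str:
    """Return the text with every NUL character removed."""
    return text.replace("\x00", "")


# Hypothetical corpus line containing embedded NUL characters.
sample = "good\x00 corpus\x00 line"
cleaned = strip_null_chars(sample)
print(cleaned)
```

For large files, the same replacement could be applied line by line while reading, so the whole corpus never has to sit in memory at once.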
-
Project mention: Rivian GeoLocation Plotting with IRIS Cloud Document and Databricks | dev.to | 2024-12-26
We are using geopandas and geodatasets for a straightforward approach to plotting.
-
Mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
Python Pandas discussion
Python Pandas related posts
-
Fixing timestamp overflow error in Python
-
Rivian GeoLocation Plotting with IRIS Cloud Document and Databricks
-
Build a Competitive Intelligence Tool Powered by AI
-
FireDucks: Pandas but 100x Faster
-
The Polars vs. Pandas difference nobody is talking about – Labs
-
DuckDB over Pandas/Polars
-
How to Use Lambda Functions in Python
-
A note from our sponsor - SaaSHub
www.saashub.com | 17 Jan 2025
Index
What are some of the best open-source Panda projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Pandas | 44,267 |
2 | 30-Days-Of-Python | 43,951 |
3 | tqdm | 29,050 |
4 | data-science-ipython-notebooks | 27,721 |
5 | datasets | 19,470 |
6 | yfinance | 15,414 |
7 | pandas-ai | 13,970 |
8 | pygwalker | 13,701 |
9 | Dask | 12,857 |
10 | seaborn | 12,743 |
11 | ydata-profiling | 12,634 |
12 | modin | 9,980 |
13 | mlcourse.ai | 9,862 |
14 | visidata | 7,995 |
15 | pandas-ta | 5,637 |
16 | ibis | 5,434 |
17 | lux | 5,240 |
18 | orange | 4,946 |
19 | geopandas | 4,599 |
20 | Mimesis | 4,474 |
21 | alpha_vantage | 4,350 |
22 | pytorch-forecasting | 4,063 |
23 | missingno | 3,999 |