Python Pandas

Open-source Python projects categorized as Pandas

Top 23 Python Pandas Projects

  • Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

    Project mention: Help Us Build Our Roadmap – Pydantic | news.ycombinator.com | 2024-02-19

    There is a pull request to integrate it into both pydantic extra types and pandas core [1]

    [1]: https://github.com/pandas-dev/pandas/issues/53999

  • 30-Days-Of-Python

    The 30 Days of Python programming challenge is a step-by-step guide to learning the Python programming language in 30 days. The challenge may take more than 100 days; follow your own pace. These videos may help too: https://www.youtube.com/channel/UC7PNRuno1rzYPb1xLa4yktw

    Project mention: Top GitHub Resources to Level Up Your Python game | dev.to | 2023-11-27

    🎇 Repository Link: 30 Days of Python

  • tqdm

    ⚡ A Fast, Extensible Progress Bar for Python and CLI

    Project mention: Neat Parallel Output in Python | news.ycombinator.com | 2024-02-25

    Yeah, my code needs to use multiprocessing, which does not play nicely with tqdm. Thanks for the tip about positions though; that helped me search more effectively, and I came up with two promising comments. They are unmerged and require some workarounds, but might just work:

    https://github.com/tqdm/tqdm/issues/1000#issuecomment-184208...

  • data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • datasets

    🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

    Project mention: 🐍🐍 23 issues to grow yourself as an exceptional open-source Python expert 🧑‍💻 🥇 | dev.to | 2023-10-19

  • ydata-profiling

    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

    Project mention: FLaNK 25 December 2023 | dev.to | 2023-12-26

  • Dask

    Parallel computing with task scheduling

    Project mention: The Distributed Tensor Algebra Compiler (2022) | news.ycombinator.com | 2023-06-15

  • seaborn

    Statistical data visualization in Python

    Project mention: Seaborn bug responsible for finding of declining disruptiveness in science | news.ycombinator.com | 2024-02-25

    It's referring to the seaborn library (https://seaborn.pydata.org/), a Python library for data visualization (built on top of matplotlib).

  • yfinance

    Download market data from Yahoo! Finance's API

    Project mention: How to catch exceptions in library? | /r/learnpython | 2023-07-06

    If you check the file here - https://github.com/ranaroussi/yfinance/blob/main/yfinance/base.py - you can see this is communicated via the "raise Exception('%s: %s' % (self.ticker, err_msg))" line. I'm trying to use the following to catch the exception but no luck.

  • modin

    Modin: Scale your Pandas workflows by changing a single line of code

    Project mention: The Distributed Tensor Algebra Compiler (2022) | news.ycombinator.com | 2023-06-15

  • mlcourse.ai

    Open Machine Learning Course

    Project mention: Open Machine Learning Course | news.ycombinator.com | 2023-10-22

  • pygwalker

    PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

    Project mention: Show HN: Data Painter – different way to interact with data in Jupyter notebook | news.ycombinator.com | 2024-01-02

  • visidata

    A terminal spreadsheet multitool for discovering and arranging data

    Project mention: Fx – Terminal JSON Viewer | news.ycombinator.com | 2023-09-19

    [4] "Is it possible to "flatten" structured data (like JSON?)": https://github.com/saulpw/visidata/discussions/1605

  • lux

    Automatically visualize your pandas dataframe via a single print! 📊 💡 (by lux-org)

  • pandas-ta

    Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 130+ Indicators

    Project mention: Help recreating ta-lib python MACDFIX in pure python | /r/algotrading | 2023-05-03

    I do not know what the difference between MACD and MACDFIX is, but maybe you can take a look at how MACD is implemented in the pandas_ta library and modify it a bit to achieve the behavior you want.

  • orange

    🍊 📊 💡 Orange: Interactive data analysis

    Project mention: Taxonomy Management? | /r/technicalwriting | 2023-12-05

    First is identifying the "similar" things in a corpus. The best way I know to do that, for non-programmer audiences, is the Orange Data Mining tool, which gives you a node-based text mining interface to perform statistical analysis on text. Hierarchical Clustering shows, very rapidly, how similar your "modules" are and which ones are most similar. There are many other techniques (semantic viewer, similarity hash, etc.) as well; the right one will depend on how your content is organized.

  • Mimesis

    Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.

    Project mention: Mimesis allows you to easily generate detailed dummy datasets. | /r/datascience | 2023-04-12

    Mimesis has well-structured and comprehensive documentation: https://mimesis.name

  • alpha_vantage

    A Python wrapper for the Alpha Vantage API for financial data.

    Project mention: alpha_vantage: NEW Data - star count:3917.0 | /r/algoprojects | 2023-05-15

  • geopandas

    Python tools for geographic data

    Project mention: Geopandas spatial predicate performance increase? | /r/gis | 2023-05-12

    Interesting! Full honesty I don’t know, lol. But if I were to guess… something to do with implementation of pygeos and indexing. Recent geopandas versions I believe handle joins with shapely vs. pygeos differently. Maybe if there is a declared predicate, it treats the spatial index differently. https://github.com/geopandas/geopandas/pull/1421

  • missingno

    Missing data visualization module for Python.

  • ibis

    The flexibility of Python with the scale and performance of modern SQL.

    Project mention: Ibis: The portable Python dataframe library | news.ycombinator.com | 2024-02-22

  • AWS Data Wrangler

    pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

    Project mention: Read files from s3 using Pandas/s3fs or AWS Data Wrangler? | /r/dataengineering | 2023-12-06

    I had no problem with awswrangler (https://github.com/aws/aws-sdk-pandas). It supports reading and writing partitions, which was really helpful, and a few other optimizations that made it a great tool.

  • Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials

    A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI/Deep Learning / Machine Vision / NLP and industry specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-02-25.

Index

What are some of the best open-source Pandas projects in Python? This list will help you:

Rank Project Stars
1 Pandas 41,332
2 30-Days-Of-Python 29,149
3 tqdm 27,043
4 data-science-ipython-notebooks 26,140
5 datasets 18,064
6 ydata-profiling 11,837
7 Dask 11,806
8 seaborn 11,721
9 yfinance 11,344
10 modin 9,332
11 mlcourse.ai 9,304
12 pygwalker 8,892
13 visidata 7,284
14 lux 4,870
15 pandas-ta 4,527
16 orange 4,525
17 Mimesis 4,226
18 alpha_vantage 4,133
19 geopandas 4,067
20 missingno 3,771
21 ibis 3,751
22 AWS Data Wrangler 3,745
23 Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials 3,599