Python Pandas

Open-source Python projects categorized as Pandas

Top 23 Python Pandas Projects

  • Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

    Project mention: Feeling like a fraud because I simply can't adequately understand the engineering spreadsheets' macros. | reddit.com/r/AskEngineers | 2023-02-09

    To this day I don’t know how to use Excel, so I turned to Python and Pandas

  • data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • tqdm

    A Fast, Extensible Progress Bar for Python and CLI

    Project mention: I keep getting this issue, can anyone help?? | reddit.com/r/blender | 2023-02-06

    You are trying to run a Python script that requires the tqdm package and also a regex package (which should normally be installed when installing Python). Blender tries to install these packages without success. You probably have to do it on your own by installing them in your Python virtual environment.

  • datasets

    🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

    Project mention: Need help with a data science project | reddit.com/r/learnmachinelearning | 2023-01-30

  • Dask

    Parallel computing with task scheduling

    Project mention: A peek into Location Data Science at Ola | dev.to | 2022-09-26

    Data scientists work on phenomenally large datasets, and Dask is a handy tool for exploration within the confines of a single cloud VM or their local PCs. Location data visualization is an essential part of deciding further algorithm development and roadmap for projects. This lays the foundation for data engineering and science to work at scale, with petabytes of data.

  • seaborn

    Statistical data visualization in Python

    Project mention: The Python Packages That Gave Me Nightmares: A Guide to Overcoming Common Challenges | dev.to | 2023-02-08

    Seaborn: Seaborn is a data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. However, it can be difficult to integrate with other libraries and customize the visualizations to your specific needs. GitHub - https://github.com/mwaskom/seaborn

  • ydata-profiling

    Create HTML profiling reports from pandas DataFrame objects

    Project mention: pandas-profiling VS Rath - a user suggested alternative | libhunt.com/r/pandas-profiling | 2023-01-12

  • yfinance

    Download market data from Yahoo! Finance's API

    Project mention: If you bought every stock removed from the ASX200 in 2020, you would've outperformed the ASX200 by 55% over the last two years. You would also have outperformed someone who bought every stock added to the ASX200 by 85%. | reddit.com/r/AusFinance | 2022-12-30

    From the S&P Global rebalancing announcements, I pulled every company that was added and removed from the ASX200 in 2020 and made two separate portfolios of even weighting of all stocks. I then backtested, including dividends, the performance of both across the past two years with yfinance.

  • mlcourse.ai

    Open Machine Learning Course

    Project mention: mlcourse.ai: NEW Courses - star count:8584.0 | reddit.com/r/algoprojects | 2023-02-04

  • modin

    Modin: Scale your Pandas workflows by changing a single line of code

    Project mention: Modern Polars: an extensive side-by-side comparison of Polars and Pandas | news.ycombinator.com | 2023-01-07

    Yeah, tried Polars a couple of times: the API seems worse than Pandas to me too. eg the decision only to support autoincrementing integer indexes seems like it would make debugging "hmmm, that answer is wrong, what exactly did I select?" bugs much more annoying. Polars docs write "blazingly fast" all over them but I doubt that is a compelling point for people using single-node dataframe libraries. It isn't for me.

    Modin (https://github.com/modin-project/modin) seems more promising at this point, particularly since a migration path for standing Pandas code is highly desirable.

  • visidata

    A terminal spreadsheet multitool for discovering and arranging data

    Project mention: Terminal Based Programs? | reddit.com/r/linuxquestions | 2022-12-30

    VisiData is an awesome terminal spreadsheet tool. edbrowse for internet browsing.

  • lux

    Automatically visualize your pandas dataframe via a single print! 📊 💡 (by lux-org)

    Project mention: Name of library that creates multiple charts quickly | reddit.com/r/learnpython | 2023-02-04

  • orange

    🍊 📊 💡 Orange: Interactive data analysis

    Project mention: Statistical Analysis software based on Python? | reddit.com/r/Python | 2023-01-28

    Only thing I can think of is Orange, which has some statistics capability, but isn't its focus.

  • alpha_vantage

    A python wrapper for Alpha Vantage API for financial data.

    Project mention: alpha_vantage: NEW Data - star count:3782.0 | reddit.com/r/algoprojects | 2022-12-10

  • missingno

    Missing data visualization module for Python.

    Project mention: #VisualizationTip: Using Seaborn(Heatmap) to visualize Missing data( Yellow- Representation of a column's missing data.) | reddit.com/r/datascience | 2022-10-04

    Good job, but I would recommend missingno; it's a powerful module for visualizing missing values.

  • Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials

    A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI/Deep Learning / Machine Vision / NLP and industry specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.

    Project mention: Cool Github repositories for Everyone | dev.to | 2022-12-29

  • AWS Data Wrangler

    pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

    Project mention: I agree that Arrow Tables are great, but we decided to keep the library focused on the Pandas interface. [wont implement] | reddit.com/r/programmingcirclejerk | 2022-09-21

  • pandas-ta

    Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 130+ Indicators

    Project mention: How to detect price action? | reddit.com/r/algotrading | 2023-01-13

  • koalas

    Koalas: pandas API on Apache Spark

    Project mention: My new company uses Pyspark. I want to learn it before my starting date. Any advice? | reddit.com/r/datascience | 2022-11-10

    If they're using Databricks and you're familiar with pandas, koalas should be right up your alley.

  • XlsxWriter

    A Python module for creating Excel XLSX files.

    Project mention: Streamlining Data Export to Excel: A comprehensive guide to using Python, Nodejs, PHP. | dev.to | 2023-01-11

    openpyxl

  • arctic

    High performance datastore for time series and tick data

    Project mention: arctic: NEW Data - star count:2864.0 | reddit.com/r/algoprojects | 2023-01-14

  • pandarallel

    A simple and efficient tool to parallelize Pandas operations on all available CPUs

    Project mention: Meet Pandaral.lel - a tool that makes it easy to utilize all your CPUs for faster pandas operations. With just one line of code, you can parallelize your pandas work and even track progress with a progress bar. | reddit.com/r/machinelearningnews | 2023-01-03

    Code: https://github.com/nalepae/pandarallel

  • xarray

    N-D labeled arrays and datasets in Python

    Project mention: Request for Startups: Climate Tech | news.ycombinator.com | 2022-12-15

    PyTorch and JAX are used heavily in climate science on the ML side. For more general analytics, not so much. Many of our users like to use Xarray as a high-level API. There has been some work to integrate Xarray with PyTorch (https://github.com/pydata/xarray/issues/3232) but we're not there yet.

    The Python Array API standard should help align these different back-ends: https://data-apis.org/array-api/latest/

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-02-09.

Index

What are some of the best open-source Pandas projects in Python? This list will help you:

 #   Stars  Project
 1  36,782  Pandas
 2  24,602  data-science-ipython-notebooks
 3  23,904  tqdm
 4  15,143  datasets
 5  10,716  Dask
 6  10,308  seaborn
 7  10,106  ydata-profiling
 8   8,824  yfinance
 9   8,599  mlcourse.ai
10   8,347  modin
11   6,313  visidata
12   4,409  lux
13   3,929  orange
14   3,828  alpha_vantage
15   3,445  missingno
16   3,337  Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials
17   3,305  AWS Data Wrangler
18   3,283  pandas-ta
19   3,247  koalas
20   3,150  XlsxWriter
21   2,887  arctic
22   2,866  pandarallel
23   2,851  xarray