Python Dataframe

Open-source Python projects categorized as Dataframe

Top 22 Python Dataframe Projects

  • modin

    Modin: Scale your Pandas workflows by changing a single line of code

    Project mention: A Polars exploration into Kedro | dev.to | 2023-05-17

    The interesting thing about Polars is that it does not try to be a drop-in replacement for pandas, like Dask, cuDF, or Modin, and instead has its own expressive API. Despite being a young project, it quickly became popular thanks to its easy installation process and its “lightning fast” performance.
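    The "single line of code" in Modin's tagline refers to swapping the pandas import for `modin.pandas`. A minimal sketch follows; the fallback to plain pandas and the sample data are assumptions added here so the snippet still runs where Modin is not installed.

```python
# Sketch: use Modin when it is available, otherwise fall back to pandas.
try:
    import modin.pandas as pd  # the one-line change Modin advertises
except ImportError:
    import pandas as pd  # assumption: fallback so the sketch runs without Modin

# Made-up sample data; the rest of the code is ordinary pandas-style API.
df = pd.DataFrame({"symbol": ["a", "b", "a"], "price": [1.0, 2.0, 3.0]})
totals = df.groupby("symbol")["price"].sum()
```

    Everything after the import is unchanged pandas-style code, which is what distinguishes the drop-in approach from libraries such as Polars that expose their own API.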

  • vaex

    Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

  • pandas-ta

    Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 130+ Indicators

    Project mention: Help recreating ta-lib python MACDFIX in pure python | /r/algotrading | 2023-05-03

    I do not know what the difference between MACD and MACDFIX is, but maybe you can take a look at how MACD is implemented in the pandas_ta library and modify it a bit to achieve the behavior you want.
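    For readers following the "pure Python" question, here is a minimal sketch of a textbook MACD built from exponential moving averages. It is not the pandas_ta or TA-Lib implementation: the function names, the first-value seeding, and the optional `alpha` override are assumptions made for illustration, and libraries differ in how they seed and smooth the averages.

```python
def ema(values, period, alpha=None):
    """Exponential moving average, seeded with the first value (an assumption)."""
    if not values:
        return []
    # Standard smoothing constant unless a fixed one is supplied by the caller.
    k = alpha if alpha is not None else 2.0 / (period + 1)
    out = [float(values[0])]
    for v in values[1:]:
        out.append(k * v + (1.0 - k) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) as lists of floats."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


# Made-up, steadily rising closing prices.
closes = [float(i) for i in range(1, 61)]
macd_line, signal_line, histogram = macd(closes)
```

    Matching another library's output exactly usually comes down to the seeding and the smoothing constant, which is why both are kept easy to change here.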

  • koalas

    Koalas: pandas API on Apache Spark

    Project mention: My new company uses Pyspark. I want to learn it before my starting date. Any advice? | /r/datascience | 2022-11-10

    If they're using Databricks and you're familiar with pandas, koalas should be right up your alley.

  • PandasGUI

    A GUI for Pandas DataFrames

    Project mention: GUI for a Dynamically Created Dataframe | /r/learnpython | 2023-01-31

    This works with plotly but does a lot on its own if visualization isn’t the only thing you need: https://github.com/adamerose/PandasGUI

  • mars

    Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.

  • sketch

    AI code-writing assistant that understands data content

    Project mention: Pandas AI – The Future of Data Analysis | news.ycombinator.com | 2023-05-17

    This morning I added a "Related Projects" [3] Section to the Buckaroo docs. If Buckaroo doesn't solve your problem, look at one of the other linked projects (like Mito).

    [1] https://github.com/approximatelabs/sketch

    [2] https://github.com/paddymul/buckaroo

    [3] https://buckaroo-data.readthedocs.io/en/latest/FAQ.html

  • pyjanitor

    Clean APIs for data cleaning. Python implementation of R package Janitor

    Project mention: Sub library with useful code | /r/learnpython | 2023-05-19

  • hamilton

    A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

    Project mention: Write production grade pandas (and other libraries!) with Hamilton | /r/Python | 2023-02-27

    And find the repository here: https://github.com/dagworks-inc/hamilton/

  • optopsy

    A nimble options backtesting library for Python

    Project mention: optopsy: NEW Derivatives and Hedging - star count:605.0 | /r/algoprojects | 2022-08-27

  • hamilton

    A scalable general purpose micro-framework for defining dataflows. You can use it to build dataframes, numpy matrices, python objects, ML models, etc. Embed Hamilton anywhere python runs, e.g. spark, airflow, jupyter, fastapi, python scripts, etc. Comes with lineage out of the box. (by DAGWorks-Inc)

    Project mention: Free access to beta product I'm building that I'd love feedback on | /r/quants | 2023-05-31

    This is me. I drive an open source library Hamilton that people doing time-series/ML work love to use. I'm building a paid product around it at DAGWorks, and I'm after feedback on our current version. Can I entice anyone to:

  • technical

    Various indicators developed or collected for Freqtrade

  • pandastable

    Table analysis in Tkinter using pandas DataFrames.

    Project mention: interactive table on gui | /r/learnpython | 2022-06-09

  • Daft

    The Python DataFrame for Complex Data

    Project mention: Daft: A High-Performance Distributed Dataframe Library for Multimodal Data | news.ycombinator.com | 2023-06-06

  • eland

    Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

    Project mention: I'm getting elasticsearch.BadRequestError: BadRequestError(400, 'illegal_argument_exception', "specified fields can't be null or empty") using Eland library | /r/elasticsearch | 2023-05-02

    We have a fix for this issue reported here merged and pending a release. Hopefully that release will happen in the next few days, then you can upgrade and the default experience for everyone won't be as confusing :)

  • pystore

    Fast data store for Pandas time-series data

  • static-frame

    Immutable and grow-only Pandas-like DataFrames with a more explicit and consistent interface.

    Project mention: Memoizing DataFrame Functions: Using Hashable DataFrames and Message Digests to Optimize Repeated Calculations | dev.to | 2023-03-01

    StaticFrame is an alternative DataFrame library that offers efficient solutions to this problem, both for in-memory and disk-based memoization.

  • snowpark-python

    Snowflake Snowpark Python API

    Project mention: [GitHub] snowflakedb/snowpark-python: Snowflake Snowpark Python API (open source!) | /r/snowflake | 2022-06-16

  • tablexplore

    Table analysis and plotting application written in PySide2/PyQt5

  • viper

    Simple, expressive pipeline syntax to transform and manipulate data with ease (by aropele)

    Project mention: Simple, expressive pipeline syntax to transform and manipulate data with ease | news.ycombinator.com | 2023-01-24

  • frame-fixtures

    Use compact expressions to create diverse, deterministic DataFrame fixtures with StaticFrame

    Project mention: The Performance Advantage of No-Copy DataFrame Operations | dev.to | 2022-11-21

    To compare performance, we will use the FrameFixtures library to create two DataFrames of 10,000 rows by 1,000 columns of heterogeneous types. For both we can convert the StaticFrame Frame into a Pandas DataFrame.

  • Solomon

    Data Exploration tool.

    Project mention: Solomon: Data Exploration tool. | /r/Python | 2023-03-21

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-06-06.

Index

What are some of the best open-source Dataframe projects in Python? This list will help you:

Rank  Project                     Stars
1     modin                       8,679
2     vaex                        7,913
3     pandas-ta                   3,672
4     koalas                      3,269
5     PandasGUI                   2,939
6     mars                        2,596
7     sketch                      1,796
8     pyjanitor                   1,147
9     hamilton                      879
10    optopsy                       705
11    hamilton (by DAGWorks-Inc)    643
12    technical                     594
13    pandastable                   564
14    Daft                          564
15    eland                         522
16    pystore                       492
17    static-frame                  345
18    snowpark-python               153
19    tablexplore                    96
20    viper                          14
21    frame-fixtures                  6
22    Solomon                         1