pandas_flavor
pandas
Metric | pandas_flavor | pandas |
---|---|---|
Mentions | 2 | 1 |
Stars | 294 | 33,264 |
Growth | 0.7% | - |
Activity | 1.2 | 10.0 |
Last commit | 16 days ago | about 2 years ago |
Language | Python | Python |
License | MIT License | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pandas_flavor
-
This OOP habit disturbs me (super().__init__(args accumulation):)
There's established ways to extend pandas btw:
- https://pandas.pydata.org/docs/development/extending.html
- Also, https://github.com/pyjanitor-devs/pandas_flavor
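For context, the first link describes pandas' built-in accessor mechanism. A minimal sketch of it is below, assuming a made-up accessor name `cleaning` and a made-up helper `drop_empty_rows`; neither comes from the post, and only `pd.api.extensions.register_dataframe_accessor` is pandas' own API.

```python
import pandas as pd


# "cleaning" is a hypothetical accessor name chosen for this sketch.
@pd.api.extensions.register_dataframe_accessor("cleaning")
class CleaningAccessor:
    def __init__(self, pandas_obj: pd.DataFrame) -> None:
        # pandas passes the DataFrame the accessor is attached to.
        self._df = pandas_obj

    def drop_empty_rows(self) -> pd.DataFrame:
        """Return a copy without rows in which every value is missing."""
        return self._df.dropna(how="all")


df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [4.0, None, None]})

# The helper is reached through the accessor namespace, not by subclassing.
cleaned = df.cleaning.drop_empty_rows()
```

This keeps custom logic in a namespace (`df.cleaning.…`) without a `DataFrame` subclass, so there is no `super().__init__` chain to maintain.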
-
Using Python Classes to Streamline Data Modelling/Cleaning
Check out pandas-flavor. It's a library that lets you register methods to dataframes. There's definitely a time and a place for OO in pandas data processing but your examples can probably be more simply expressed as methods and pandas flavor can make them easy to "find" as extensions of the frame.
pandas
-
Does pandas iterrows have performance issues?
This discussion on GitHub led me to believe it is caused when mixing dtypes in the dataframe, however the simple example below shows it is there even when using one dtype (float64). This takes 36 seconds on my machine:
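The poster's own snippet is not included here. The sketch below is not that benchmark; it only illustrates, on an assumed small all-`float64` frame with made-up column names, how the same computation can be written with `iterrows`, with `itertuples`, and as a vectorized expression. No timings are claimed.

```python
import numpy as np
import pandas as pd

# Assumed sample data: 1,000 rows, a single dtype (float64).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((1_000, 3)), columns=["a", "b", "c"])

# Row by row: iterrows constructs a new Series object for every row,
# which adds per-row overhead regardless of the column dtypes.
loop_total = 0.0
for _, row in df.iterrows():
    loop_total += row["a"] * row["b"]

# itertuples yields lightweight namedtuples instead of Series.
tuple_total = sum(r.a * r.b for r in df.itertuples(index=False))

# Vectorized: the multiplication and sum run over whole columns at once.
vector_total = (df["a"] * df["b"]).sum()
```

All three produce the same total up to floating-point rounding; wrapping each variant in `time.perf_counter()` calls is one way to compare them on a given machine.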
What are some alternatives?
data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
pandas-datareader - Extract data from a wide range of Internet sources into a pandas DataFrame.
modin - Modin: Scale your Pandas workflows by changing a single line of code
gurobipy-pandas - Convenience wrapper for building optimization models from pandas data
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
datasets - 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
datasloth - Natural language Pandas queries and data generation powered by GPT-3
AWS Data Wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
pandas-chat - pandas-ai is a Python library that uses ChatGPT prompts to analyze and process pandas data in a conversational way.