Top 23 Panda Open-Source Projects
-
Pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
-
30-Days-Of-Python
30 days of Python programming challenge is a step-by-step guide to learn the Python programming language in 30 days. This challenge may take more than 100 days; follow your own pace. These videos may help too: https://www.youtube.com/channel/UC7PNRuno1rzYPb1xLa4yktw
-
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
-
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
-
ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
-
pandas-ta
Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 150+ Indicators
-
danfojs
Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.
Dash is a Python framework that enables you to build interactive frontend applications without writing a single line of Javascript. Internally and in projects we like to use it in order to build a quick proof of concept for data driven applications because of the nice integration with Plotly and pandas. For this post, I'm going to assume that you're already familiar with Dash and won't explain that part in detail. Instead, we'll focus on what's necessary to make it run serverless.
Project mention: About Data analyst, data scientist and data engineer, resources and experiences | dev.to | 2024-03-26
Python Data Science Handbook
Repository Link: 30 Days of Python
yeah my code needs to use multiprocessing, which does not play nice with tqdm. thanks for the tip about positions though, that helped me search more effectively and came up with two promising comments. unmerged / require some workarounds, but might just work:
https://github.com/tqdm/tqdm/issues/1000#issuecomment-184208...
Get started with Data Science in the Data Science for Beginners curricula.
Project mention: 23 issues to grow yourself as an exceptional open-source Python expert | dev.to | 2023-10-19
If you are doing data analysis I don't think any of the 3 pieces of software you mentioned are going to be that helpful.
I see these products as tools for data visualization and reporting i.e. presenting prepared datasets to users in a visually appealing way. They aren't as well suited for serious analytics.
I can't comment on Superset or Tableau, but I am familiar with Power BI (it has been rolled out across my org), and the statistics you can do with it are fairly rudimentary. If you need to do anything beyond summarizing (counts, averages, min, max, etc.), it is not particularly easy.
For data analysis I use SAS or R. This software allows you to do things like multivariate regression, time series forecasting, PCA, cluster analysis, etc. There is also plotting capability.
Both of these products are kind of old school (I've been using them since the early 2000s); the "new school" seems to be Python. Pretty much all the recent data science people in my organization use Python, particularly Pandas and libraries like Seaborn (https://seaborn.pydata.org/).
The "power" users of Power BI in my organization tend to be finance/HR people, with use cases like drilling down into cost figures or interactively presenting KPIs and other headline figures to management.
If you check the file here - https://github.com/ranaroussi/yfinance/blob/main/yfinance/base.py - you can see this is communicated via the "raise Exception('%s: %s' % (self.ticker, err_msg))" line. I'm trying to use the following to catch the exception but no luck.
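A minimal sketch of catching that kind of error, assuming the library raises a plain `Exception` built with the format string quoted above. The `fetch_history` and `safe_fetch` functions are hypothetical stand-ins, not part of yfinance, so the example runs without network access. Whether the exception actually reaches the caller in a given yfinance version is not something this sketch can confirm.

```python
def fetch_history(ticker, err_msg=None):
    """Hypothetical stand-in for a library call that fails like the quoted line."""
    if err_msg is not None:
        # Same construction as the line quoted from base.py.
        raise Exception('%s: %s' % (ticker, err_msg))
    return {"ticker": ticker, "rows": []}


def safe_fetch(ticker, err_msg=None):
    """Return (result, error_text); exactly one of the two is None."""
    try:
        return fetch_history(ticker, err_msg), None
    except Exception as exc:
        # A bare Exception has no dedicated subclass to target, so the
        # message text is the only detail available to inspect.
        return None, str(exc)


result, error = safe_fetch("FAKE", "No data found")
print(error)
```

Because the error is a generic `Exception`, the handler cannot distinguish it from other failures by type; matching on the message text is the only option in this setup.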
Project mention: Show HN: Use an "eraser" to clean data on flight without breaking your workflow | news.ycombinator.com | 2024-03-15
[4] "Is it possible to "flatten" structured data (like JSON?)": https://github.com/saulpw/visidata/discussions/1605
The interesting thing about Polars is that it does not try to be a drop-in replacement for pandas, like Dask, cuDF, or Modin, and instead has its own expressive API. Despite being a young project, it quickly got popular thanks to its easy installation process and its “lightning fast” performance.
Project mention: Grafana Beyla: OSS eBPF auto-instrumentation for application observability | news.ycombinator.com | 2023-09-13
I do not know what the difference between MACD and MACDFIX is, but maybe you can take a look at how MACD is implemented in the pandas_ta library and modify it a bit to achieve the behavior you want.
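As a starting point for that kind of modification, here is a dependency-free sketch of the textbook MACD calculation. It is not the pandas_ta implementation; the function names and the 12/26/9 defaults are assumptions based on the conventional definition, and libraries differ in how they seed and warm up the averages, so the numbers will not necessarily match theirs.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]  # seed with the first observation (an assumption)
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(close, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) as lists of floats."""
    fast_ema = ema(close, fast)
    slow_ema = ema(close, slow)
    # MACD line: fast EMA minus slow EMA.
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    # Signal line: EMA of the MACD line itself.
    signal_line = ema(macd_line, signal)
    # Histogram: distance between the MACD line and its signal line.
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


prices = [float(p) for p in range(100, 160)]  # made-up, steadily rising series
line, sig, hist = macd(prices)
```

Changing the `fast`, `slow`, or `signal` arguments, or the seeding rule inside `ema`, is where a variant behavior would be introduced.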
I know I've tooted its horn before, but Orange3 is a pretty neat Python-based GUI platform that makes this and a metric buttload of other statistical/ML techniques available to non-programmer types.
Just watch out for the null character `\x00` in the corpus. That always seems to kill it stone dead.
https://orangedatamining.com/
https://orange3.readthedocs.io/projects/orange-visual-progra...
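One way to guard against the null-character problem mentioned above is to clean the text before it is loaded. This is a minimal sketch that assumes the corpus is available as a list of Python strings; `strip_nulls` is a hypothetical helper, not an Orange3 function.

```python
def strip_nulls(text):
    """Remove NUL characters from a string."""
    return text.replace("\x00", "")


docs = ["clean line", "bad\x00line", "\x00"]  # made-up sample documents
cleaned = [strip_nulls(d) for d in docs]
print(cleaned)  # ['clean line', 'badline', '']
```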
Pandas related posts
- Show HN: Hashquery, a Python library for defining reusable analysis
- The Design Philosophy of Great Tables (Software Package)
- Show HN: Use an "eraser" to clean data on flight without breaking your workflow
- Ibis: The portable Python dataframe library
- Ask HN: Problems worth solving with a low-code back end?
- Welcome to 14 days of Data Science!
- Excel Anonymizer-A Python script to anonymize data in Excel files
Index
What are some of the best open-source Panda projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | Pandas | 41,923 |
2 | PythonDataScienceHandbook | 41,407 |
3 | 30-Days-Of-Python | 31,031 |
4 | tqdm | 27,405 |
5 | data-science-ipython-notebooks | 26,459 |
6 | Data-Science-For-Beginners | 26,290 |
7 | datasets | 18,376 |
8 | ydata-profiling | 12,022 |
9 | Dask | 11,982 |
10 | seaborn | 11,946 |
11 | yfinance | 11,778 |
12 | pandas_exercises | 10,159 |
13 | pygwalker | 9,759 |
14 | modin | 9,465 |
15 | mlcourse.ai | 9,390 |
16 | visidata | 7,409 |
17 | cudf | 7,274 |
18 | py | 6,626 |
19 | pixie | 5,273 |
20 | lux | 4,915 |
21 | pandas-ta | 4,732 |
22 | danfojs | 4,649 |
23 | orange | 4,604 |