Pandas AI – The Future of Data Analysis

sketch

Mentions: 20 | Stars: 2,200 | Activity: 4.4 | Language: Python

AI code-writing assistant that understands data content

This morning I added a "Related Projects" [3] section to the Buckaroo docs. If Buckaroo doesn't solve your problem, look at one of the other linked projects (like Mito).
[1] https://github.com/approximatelabs/sketch
[2] https://github.com/paddymul/buckaroo
[3] https://buckaroo-data.readthedocs.io/en/latest/FAQ.html

Pandas

Mentions: 399 | Stars: 42,039 | Activity: 10.0 | Language: Python

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

I asked GPT-4 this:
can you visit https://pandas.pydata.org/about/governance.html and tell me if I am allowed to use the term 'pandas' in the name of another unaffiliated project, for example 'pandas-ai'
--
Based on the BSD 3-Clause License under which pandas is released, neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission [1]. This means that to use the term 'pandas' in the name of another unaffiliated project such as 'pandas-ai', you would likely need to get written permission from the pandas project's copyright holders.
However, please note that this is not legal advice, and it would be a good idea to consult with a lawyer who specializes in open-source software or intellectual property law to ensure that you're in compliance with all legal requirements.
[1] https://github.com/pandas-dev/pandas/blob/main/LICENSE

mito

Mentions: 18 | Stars: 2,223 | Activity: 10.0 | Language: Python

The mitosheet package, trymito.io, and other public Mito code.

I think the biggest area for growth for LLM-based tools for data analysis is around helping users _understand what edits they actually made_.
I'm a co-founder of a non-AI code-gen tool for data analysis, but we also have a basic version of an LLM integration. The problem we see with tooling like Pandas AI (in practice! with real users at enterprises!) is that users make an edit like "remove NaN values" and then get a new dataframe, but they have no way of checking if the edited dataframe is actually what they want. Maybe the LLM removed NaN values. Maybe it just deleted some random rows!
The key here: how can users build an understanding of how their data changed, and confirm that the changes made by the LLM are the changes they wanted? In other words, recon!
We've been experimenting more with this recon step in the AI flow (you can see the final PR here: https://github.com/mito-ds/monorepo/pull/751). It takes a similar approach to the top comment (passing a subset of the data to the LLM), and then focuses the UI on "what changes were made." There's a lot of opportunity for growth here, I think!
Any/all feedback appreciated :)
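
A minimal sketch of the kind of recon check described in the comment above, written with plain pandas. This is not Mito's implementation: the function name recon_dropna and the sample data are hypothetical, and the sketch assumes the edit kept the original, unique index labels so that rows can be matched before and after.

```python
import pandas as pd


def recon_dropna(before: pd.DataFrame, after: pd.DataFrame) -> dict:
    """Check whether an edit that claims to 'remove NaN values' did only that.

    Assumption: `after` keeps the index labels of `before`, and they are unique.
    """
    # Rows that existed before the edit but are gone afterwards.
    removed = before.index.difference(after.index)
    # Rows the request should have removed: any row with at least one NaN.
    expected = before[before.isna().any(axis=1)].index
    return {
        "rows_before": len(before),
        "rows_after": len(after),
        # Rows dropped although they had no NaN (the "random rows" case).
        "unexpected_removals": removed.difference(expected).tolist(),
        # Rows with NaN that survived the edit.
        "missed_nan_rows": expected.intersection(after.index).tolist(),
    }


# Hypothetical sample data: rows 1 and 2 contain missing values.
before = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", None, "Pune"],
        "temp": [3.0, None, 21.0, 30.0],
    }
)

good_edit = before.dropna()               # what the user asked for
bad_edit = before.drop(index=[0, 1, 2])   # also deletes a clean row

print(recon_dropna(before, good_edit))
print(recon_dropna(before, bad_edit))
```

For the good edit both lists come back empty; for the bad edit, row 0 shows up under unexpected_removals. A report like this gives the user something concrete to confirm instead of trusting the edited dataframe blindly.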

pandas-ai

Mentions: 14 | Stars: 11,051 | Activity: 9.8 | Language: Python

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

The Medium article is OK, if blocked at times. It is just a summary, and not by the package author.
You can jump to the code at https://github.com/gventuri/pandas-ai to see more of what it's trying to do.

pandasql

Mentions: 3 | Stars: 1,286 | Activity: 0.0 | Language: Python

sqldf for pandas

buckaroo

Mentions: 10 | Stars: 160 | Activity: 8.9 | Language: Jupyter Notebook

Buckaroo - the data wrangling assistant for pandas. Quickly explore dataframes, and run pandas commands via a GUI. Works inside the jupyter notebook.


NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

The Design Philosophy of Great Tables (Software Package)
7 projects | news.ycombinator.com | 4 Apr 2024

Welcome to 14 days of Data Science!
1 project | dev.to | 7 Mar 2024

Read files from s3 using Pandas/s3fs or AWS Data Wrangler?
3 projects | /r/dataengineering | 6 Dec 2023

What codegen is (actually) good for
2 projects | news.ycombinator.com | 28 Sep 2023

Data Science for Beginners - A Curriculum
1 project | /r/programming | 8 Sep 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Data Science Python Pandas Data Analysis Data
Post date: 17 May 2023
