Pandas v2.0 Released

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

    Your link is broken for me, but going to their website and clicking on the 2.0 what's new link takes me to the same URL. They might be updating it... the closest I found was the Sphinx docs source for that: https://github.com/pandas-dev/pandas/blob/main/doc/source/wh...

  • jupysql

    Better SQL in Jupyter. 📊

    How are people managing the existence of data frame APIs like pandas/polars with SQL engines like BigQuery, Snowflake, and DuckDB?

    Most of my notebooks are a mix of SQL and Python: SQL for most of the processing, with the results dumped into a pandas dataframe (via https://github.com/ploomber/jupysql), and then Python for operations that are difficult to express in SQL (or that I don't know how to express in SQL). I end up with about 80% SQL and 20% Python.

    Unsure if this is the best workflow but it's the most efficient one I've come up with.

    Disclaimer: my team develops JupySQL.
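
The SQL-first, Python-second workflow described in the comment above can be sketched roughly as follows. This is only an illustration, not the commenter's actual setup: it uses Python's built-in sqlite3 module and pandas.read_sql_query in place of JupySQL and the engines mentioned, and the orders table, its columns, and its rows are all made up for the example.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for a real warehouse table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer TEXT, month TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('alice', '2023-01', 10.0),
        ('alice', '2023-02', 30.0),
        ('bob',   '2023-01', 20.0),
        ('bob',   '2023-02', 20.0),
        ('bob',   '2023-02', 5.0);
    """
)

# Step 1 (SQL): do the bulk of the processing, here a grouped aggregation,
# and load the result into a pandas dataframe.
monthly = pd.read_sql_query(
    """
    SELECT customer, month, SUM(amount) AS total
    FROM orders
    GROUP BY customer, month
    ORDER BY customer, month
    """,
    conn,
)
conn.close()

# Step 2 (Python): reshape and compute something that is more awkward in SQL,
# here a wide table of month-over-month growth per customer.
wide = monthly.pivot(index="month", columns="customer", values="total")
growth = wide.pct_change()

print(monthly)
print(growth)
```

The split point is a judgment call: the aggregation stays in SQL because the engine handles it well, while the pivot and percentage change are one-liners in pandas.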

  • polars-benchmark

    Polars author here. I have run the TPC-H benchmark against polars and pandas 2.0 backed by arrow types.

    https://github.com/pola-rs/tpch/pull/36

    Pandas having arrow as backend is great and will make interop with the arrow community (and polars) much better.

    However, if you need performance, polars remains orders of magnitude faster on whole queries. Changing to the arrow memory format does not change that.

  • db-benchmark

    reproducible benchmark of database-like ops

    If interested in benchmarks comparing different dataframe implementations, here is one:

    https://h2oai.github.io/db-benchmark/

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives, so a higher number means a more popular project.

Related posts

  • The Design Philosophy of Great Tables (Software Package)

    7 projects | news.ycombinator.com | 4 Apr 2024
  • Read files from s3 using Pandas/s3fs or AWS Data Wrangler?

    3 projects | /r/dataengineering | 6 Dec 2023
  • How to Build and Deploy a Machine Learning model using Docker

    5 projects | dev.to | 30 Jul 2023
  • We are the developers behind pandas, currently preparing for the 2.0 release :) AMA

    9 projects | /r/Python | 1 Mar 2023
  • Talking Data: What do we need for engaging data analytics?

    4 projects | dev.to | 6 Oct 2022
