Modern Polars: an extensive side-by-side comparison of Polars and Pandas

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • redframes

    General Purpose Data Manipulation Library

  • I'm not GP, but I find the pandas API incredibly inconsistent, and it is hard to remember how to do even simple transformations. For example, it sometimes overloads operators because it doesn't use built-in language features like lambdas. There are reasons for the inconsistency, but using alternatives like R's tidyverse or Julia's DataFrames.jl is like night and day for me.

    I recently found RedFrames [1], which wraps Pandas dataframes with a more consistent interface. It's probably what I'd use if I had to write data transformations that had to be compatible with Pandas.

    [1] https://github.com/maxhumber/redframes

  • dplyr

    dplyr: A grammar of data manipulation

  • It really can't be said enough how pandas is a mess. It has way too much surface area and no common thread pulling it all together. This gets obvious when you work with better dataframe libs like dplyr [1] or DataFramesMeta [2]. I've worked on production systems with all of these libs, this is not gratuitous bashing.

    [1] https://dplyr.tidyverse.org/

  • modin

    Modin: Scale your Pandas workflows by changing a single line of code

  • Yeah, tried Polars a couple of times: the API seems worse than Pandas to me too. E.g., the decision to support only autoincrementing integer indexes seems like it would make debugging "hmmm, that answer is wrong, what exactly did I select?" bugs much more annoying. The Polars docs have "blazingly fast" written all over them, but I doubt that is a compelling point for people using single-node dataframe libraries. It isn't for me.

    Modin (https://github.com/modin-project/modin) seems more promising at this point, particularly since a migration path for standing Pandas code is highly desirable.

  • pandoc

    Universal markup converter

  • Not the author but it seems that the site was made using Quarto [1] which uses pandoc [2] behind the scenes for producing the final output. The pandoc website suggests EPUB is possible.

    [1] https://quarto.org/docs/get-started/authoring/text-editor.ht...

    [2] https://pandoc.org/

  • tidypolars

    Tidy interface to polars

  • There's a tidypolars package that appears to be well-maintained: https://github.com/markfairbanks/tidypolars


Related posts

  • Generating PDF 📄 with Python 🐍

    3 projects | /r/learnpython | 15 Dec 2022
  • Project to rebuild papers with plaintext markup languages

    7 projects | /r/Open_Science | 25 Sep 2021
  • TimesFM (Time Series Foundation Model) for time-series forecasting

    4 projects | news.ycombinator.com | 8 May 2024
  • Beautifying Org Mode in Emacs (2018)

    6 projects | news.ycombinator.com | 15 Apr 2024
  • LaTeX makes me so angry at word

    1 project | news.ycombinator.com | 26 Mar 2024