Python for Data Analysis, 3rd Edition – The Open Access Version Online

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • star-history

    The missing star history graph of GitHub repos - https://star-history.com

  • db-benchmark

    reproducible benchmark of database-like ops

    I think that neither "outlier toolchain" nor "some percentage increase" is fair. This benchmark [0] shows a significant speedup along with lower memory use. You still need to reach for dask/spark for really big data, where a task needs a cluster of beefy machines.

    If you use an r5d.24xlarge-like [1] instance, you can skip spark/dask for most workflows, as 768 GB is plenty. On top of that, polars will efficiently use the 96 available cores when computing your join, groupby, etc.

    Also, polars is getting more and more popular [2].

    [0] -- https://h2oai.github.io/db-benchmark/
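
The "768 GB is plenty" claim above can be sanity-checked with some rough arithmetic. The following is a minimal sketch, not a measurement: the workload (1 billion rows, nine 8-byte columns) and the 3x working-space factor for joins and groupbys are hypothetical assumptions, and the function names are made up for illustration.

```python
GIB = 1024 ** 3  # bytes per GiB


def estimated_table_bytes(n_rows: int, column_widths_bytes: list[int]) -> int:
    """Rough size of a dense columnar table: rows times the sum of per-value widths."""
    return n_rows * sum(column_widths_bytes)


def fits_in_memory(table_bytes: int, ram_bytes: int, working_factor: float = 3.0) -> bool:
    """True if the table plus working space for intermediate results fits in RAM.

    working_factor is an assumed rule of thumb, not a figure from any benchmark.
    """
    return table_bytes * working_factor <= ram_bytes


# Hypothetical workload: 1 billion rows, nine 8-byte numeric columns.
table = estimated_table_bytes(1_000_000_000, [8] * 9)

# RAM figure taken from the comment above, treated here as 768 GiB.
ram = 768 * GIB

print(f"table: {table / GIB:.1f} GiB, fits on one machine: {fits_in_memory(table, ram)}")
```

String columns, null masks, and wide joins can push real memory use well above this estimate, so the result is only a first check before deciding between a single large instance and a cluster.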

  • xarray

    N-D labeled arrays and datasets in Python

    Does polars have N-D labelled arrays, and if so, can it perform computations on them quickly? I've been thinking of moving from pandas to xarray [0], but might consider polars too if it has some of that functionality.

    [0] https://xarray.dev/

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
