Introducing tidypolars - a Python data frame package for R tidyverse users

This page summarizes the projects mentioned and recommended in the original post on /r/rstats.

  • polars

    Dataframes powered by a multithreaded, vectorized query engine, written in Rust

  • tidypolars uses the polars package as a backend, which might be the fastest data frame manipulation library out there. (Faster even than R's data.table, which has been the king of speed for many years.)

  • db-benchmark

    reproducible benchmark of database-like ops

  • I think having a basic understanding of pandas, given how broadly it's used, is beneficial. That being said, polars seems to be matching or beating data.table in performance, so I think it's well worth picking up. Wes McKinney, creator of pandas, has been quite vocal about the architectural flaws of pandas -- which is why he's been working on the Arrow project. polars is based on Arrow, so in principle it's kinda like pandas 2.0 (adopting the changes that Wes proposed).

  • Apache Arrow

    Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

  • tidypolars

    Tidy interface to polars

  • extendr

    R extension library for Rust, designed to be familiar to R users.

  • tidytable

    Tidy interface to 'data.table'

  • What's cool about this (and /u/GoodAboutHood's other package, tidytable) is that they bring the widely used tidyverse syntax to high-performance packages without sacrificing speed (or, as I think dtplyr does, making things too complicated).
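
To make the "tidy interface over a backend" idea above concrete, here is a minimal sketch in plain Python. It is an illustration only: `TinyTibble` and its methods are hypothetical names invented for this example, not the tidypolars or tidytable API, and a plain list of dicts stands in for a fast backend such as polars. The point is only the shape of the design: each verb returns a new object, so calls chain like a tidyverse pipeline.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class TinyTibble:
    """Hypothetical tidy-style wrapper (NOT the tidypolars API)."""

    def __init__(self, rows: List[Row]) -> None:
        # Copy each row so the verbs never mutate the caller's data.
        self.rows = [dict(r) for r in rows]

    def filter(self, predicate: Callable[[Row], bool]) -> "TinyTibble":
        # Keep only the rows for which the predicate is true.
        return TinyTibble([r for r in self.rows if predicate(r)])

    def mutate(self, **new_cols: Callable[[Row], Any]) -> "TinyTibble":
        # Add (or overwrite) columns computed from each existing row.
        return TinyTibble(
            [{**r, **{name: fn(r) for name, fn in new_cols.items()}} for r in self.rows]
        )

    def select(self, *names: str) -> "TinyTibble":
        # Keep only the named columns, in the order given.
        return TinyTibble([{n: r[n] for n in names} for r in self.rows])

    def arrange(self, name: str, descending: bool = False) -> "TinyTibble":
        # Sort rows by a single column.
        return TinyTibble(sorted(self.rows, key=lambda r: r[name], reverse=descending))


df = TinyTibble([
    {"x": 1, "y": "a"},
    {"x": 2, "y": "b"},
    {"x": 3, "y": "c"},
])

# Verbs chain left to right, much like a dplyr pipeline.
result = (
    df.filter(lambda r: r["x"] > 1)
    .mutate(double_x=lambda r: r["x"] * 2)
    .arrange("x", descending=True)
    .select("y", "double_x")
)
print(result.rows)
```

A real package in this style would forward each verb to the backend's own optimized operation rather than looping over rows in Python, which is how the familiar syntax can be kept without giving up speed.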
