Introducing tidypolars - a Python data frame package for R tidyverse users

This page summarizes the projects mentioned and recommended in the original post on /r/rstats

CodeRabbit: AI Code Reviews for Developers
Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.
coderabbit.ai
featured
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com
featured
  1. polars

    Dataframes powered by a multithreaded, vectorized query engine, written in Rust

    tidypolars uses the polars package as a backend, which might be the fastest data frame manipulation library out there. (Faster even than R's data.table, which has been the king of speed for many years.)

  2. CodeRabbit

    CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.

    CodeRabbit logo
  3. db-benchmark

    reproducible benchmark of database-like ops

    I think having a basic understanding of pandas, given how broadly it's used, is beneficial. That being said, polars seems to be matching or beating data.table in performance, so I think it'd be very worth it to take it up. Wes McKinney, creator of pandas, has been quite vocal about architecture flaws of pandas -- which is why he's been working on the Arrow project. polars is based on Arrow, so in principle it's kinda like pandas 2.0 (adopting the changes that Wes proposed).

  4. Apache Arrow

    Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

    I think having a basic understanding of pandas, given how broadly it's used, is beneficial. That being said, polars seems to be matching or beating data.table in performance, so I think it'd be very worth it to take it up. Wes McKinney, creator of pandas, has been quite vocal about architecture flaws of pandas -- which is why he's been working on the Arrow project. polars is based on Arrow, so in principle it's kinda like pandas 2.0 (adopting the changes that Wes proposed).

  5. tidypolars

    Tidy interface to polars

  6. extendr

    R extension library for rust designed to be familiar to R users.

  7. tidytable

    Tidy interface to 'data.table'

    What's cool about this (and /u/GoodAboutHood's other package tidytable) is that they adopt the widely used Tidyverse syntax for high-performance packages without sacrificing speed (and, in my opinion of dtplyr, making it too complicated).

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts