data.table vs datatable

data.table

R's data.table package extends data.frame (by Rdatatable)

r-datatable.com

datatable

A Python package for manipulating 2-dimensional tabular data structures (by h2oai)

Topics: Python, Data Analysis, Data Structure, Performance, ftrl

datatable.readthedocs.io

Project	data.table	datatable
Mentions	16	9
Stars	3,478	1,790
Growth	0.8%	0.8%
Activity	9.6	6.1
Latest Commit	2 days ago	5 months ago
Language	R	C++
License	Mozilla Public License 2.0	Mozilla Public License 2.0

The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
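
The growth figure described above can be sketched as a simple percentage change. This is only an illustration: the page does not state its exact formula, so the calculation below is an assumption, and the function name and star counts are hypothetical.

```python
def month_over_month_growth(stars_prev: int, stars_now: int) -> float:
    """Percentage change in GitHub stars from one month to the next.

    Assumption: growth = (now - previous) / previous * 100.
    The page does not document how it actually computes this value.
    """
    if stars_prev <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_prev) / stars_prev * 100.0


# Hypothetical star counts, chosen only for illustration.
growth = month_over_month_growth(stars_prev=2000, stars_now=2016)
print(f"{growth:.1f}%")  # prints 0.8%
```

Under this reading, two projects with very different star totals can show the same growth percentage, as in the table above.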

data.table

Posts with mentions or reviews of data.table. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-12-21.

Data.table: R's data.table package extends data.frame
1 project | news.ycombinator.com | 15 Mar 2024

Discovering Copy-on-Write in R
1 project | news.ycombinator.com | 21 Dec 2023
The data.table package may also make a huge difference in performance, and often simplifies the code as well: https://github.com/Rdatatable/data.table

new governance being proposed for data.table
1 project | /r/rstats | 11 Sep 2023

Local development environment for the data.table R project
1 project | dev.to | 20 Mar 2023
After the partial success with the development environment for R-yaml, we tried another R package called data.table as part of the Open Source Development Course. Eventually we managed to run its tests too.

Alternative to Pandas
3 projects | /r/Python | 21 Dec 2022
There's datatable. I haven't used it much, but the R version (data.table) is phenomenal.

Do python packages have long form documentation? If so can someone provide me a sample?
1 project | /r/learnpython | 25 Nov 2022
data.table README.md

How to move “time” to a new column
1 project | /r/Rlanguage | 17 Sep 2022
That's an old bug in data.table v1.12.2. It's been fixed for a while now. If you update your data.table version (e.g., install.packages("data.table")) and retry, it should work fine.

Hiring an R coder to improve efficiency of code?
3 projects | /r/rstats | 14 Sep 2022
Some suggestions:
(1) https://github.com/Rdatatable/data.table Code based on data.table will probably be fastest. There are a number of reasons for this. More here: https://cran.r-project.org/web/packages/data.table/vignettes/ and here: https://rdatatable.gitlab.io/data.table/library/data.table/html/datatable-optimize.html The GForce set of optimizations is well explained here: https://www.brodieg.com/2019/02/24/a-strategy-for-faster-group-statisitics/
(2) setDTthreads() is your friend in data.table.
(3) I have found (on Windows at least) that Microsoft Open R's use of parallel MKL is faster than CRAN's latest release. See https://mran.microsoft.com/documents/rro/multithread Microsoft recommends using setMKLthreads() if it will help.
(4) I think rfast ( https://github.com/RfastOfficial/Rfast ) is a library worth considering, although I don't know if it will help you with brms and stan operations.

Piping in R is like baking!
3 projects | /r/rstats | 13 Jun 2022
Take a look at the 22nd new feature of v1.14.3 in development here.

memory leak after data.table::fread()?
1 project | /r/Rlanguage | 5 Apr 2022

datatable

Posts with mentions or reviews of datatable. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-01-17.

Cheat Sheets for data.table to Python's pandas syntax?
1 project | /r/Rlanguage | 20 Jun 2023
Aside from that, there is a Python translation of data.table (see documentation here), which might be worth looking into. However, it hasn't had any major updates in a while: the last release 2 years ago ...

Any advice on using Pandas as a data analyst?
2 projects | /r/datascience | 17 Jan 2023

Alternative to Pandas
3 projects | /r/Python | 21 Dec 2022
There's datatable. I haven't used it much, but the R version (data.table) is phenomenal.

Need advice on whether to store data set for regression model in SQL database or by using Python modules like Pickle or Parquet
1 project | /r/algotrading | 23 May 2022
just use HDF5 or Parquet, or CSV + https://github.com/h2oai/datatable to speed up the file reading.

Massive R analysis of Data Science Language and Job Trends 2022
2 projects | /r/rstats | 29 Jan 2022

Scikit-Learn Version 1.0
11 projects | news.ycombinator.com | 14 Sep 2021
> For me I had with pandas the most issues using its multiindex.
Yessss. I loathe indices, and have never been in a situation where I was better off with them than without them.
> Regarding fast you have something like Vaex on python side
I've never used Vaex, but I've used datatable (https://github.com/h2oai/datatable) and polars (https://github.com/pola-rs/polars). Polars is my favorite API, but datatable was faster at reading data (Polars was faster in execution). I'll have to give Vaex a try at some point.

Show HN: Sheet2dict – simple Python XLSX/CSV reader/to dictionary converter
5 projects | news.ycombinator.com | 21 Apr 2021

Hey Reddit, here's my comprehensive course on Python Pandas, for free.
1 project | /r/Python | 1 Feb 2021
Yep. I think this is the downside to a package being entirely maintained by volunteers. In any case, Pandas is still the leading data wrangling package for Python. (I'm excited to see how datatable evolves.)

Ditching Excel for Python in a Legacy Industry (Reinsurance)
3 projects | news.ycombinator.com | 30 Dec 2020
h2o's data.table clone is fine
https://github.com/h2oai/datatable
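
One of the posts above suggests pairing CSV files with datatable to speed up file reading. A minimal sketch of that idea follows. The helper name `csv_shape` and the sample data are hypothetical, and because datatable may not be installed (another post notes it has not had a release in a while), the sketch falls back to the standard-library `csv` module. It does not measure speed.

```python
import csv
import os
import tempfile

try:
    import datatable as dt  # h2oai/datatable; optional, may not be installed
except ImportError:
    dt = None


def csv_shape(path):
    """Return (n_rows, n_cols) for a CSV file that has a header row.

    Hypothetical helper: uses datatable's fread when available,
    otherwise reads the file with the standard-library csv module.
    """
    if dt is not None:
        return tuple(dt.fread(path).shape)
    with open(path, newline="") as fh:
        rows = list(csv.reader(fh))
    return (len(rows) - 1, len(rows[0]))


# Write a tiny sample file (made-up data) so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("price,volume\n1.5,10\n2.5,20\n3.5,30\n")
    sample_path = tmp.name

shape = csv_shape(sample_path)
os.remove(sample_path)
print(shape)
```

Any speed advantage from fread would only show up on large files; this sample is far too small to demonstrate it.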

What are some alternatives?

When comparing data.table and datatable you can also consider the following projects:

vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust

rust-dataframe - A Rust DataFrame implementation, built on Apache Arrow

DataFrame - C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types and contiguous memory storage

siuba - Python library for using dplyr like syntax with pandas and SQL

db-benchmark - reproducible benchmark of database-like ops

TypedTables.jl - Simple, fast, column-based storage for data analysis in Julia

scientific-visualization-book - An open access book on scientific visualization using python and matplotlib

gsir-te - Getting Started in R -- Tinyverse Edition

sktime - A unified framework for machine learning with time series

ballista - Distributed compute platform implemented in Rust, and powered by Apache Arrow.

vinum - Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics.
