Metric | ArcticDB | DataFrame
---|---|---
Mentions | 4 | 109
Stars | 1,123 | 2,280
Growth | 7.4% | -
Activity | 9.8 | 9.4
Last commit | 2 days ago | 3 days ago
Language | C++ | C++
License | GNU General Public License v3.0 or later | BSD 3-clause "New" or "Revised" License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
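The two derived metrics above can be illustrated with a short sketch. The page does not publish its formulas, so both functions below are assumptions: growth is taken as the plain percentage change between two monthly star snapshots, and the activity score is taken as a percentile rank scaled to 0-10, which is only consistent with the "9.0 means top 10%" example. The names `month_over_month_growth` and `activity_score` are hypothetical.

```python
from bisect import bisect_left


def month_over_month_growth(previous: int, current: int) -> float:
    """Percentage change in stars between two monthly snapshots (assumed formula)."""
    if previous <= 0:
        raise ValueError("previous star count must be positive")
    return (current - previous) / previous * 100.0


def activity_score(raw: float, all_raw: list[float]) -> float:
    """Map a raw activity value to a 0-10 score by percentile rank.

    Assumed scheme: the score is 10 times the fraction of tracked
    projects whose raw activity is strictly lower than this one.
    How the raw value weights recent commits is not specified on the
    page, so it is treated as an opaque input here.
    """
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)  # number of projects with lower raw activity
    return 10.0 * below / len(ranked)
```

Under these assumptions, a project that is more active than 90 out of 100 tracked projects scores 9.0, matching the example given above.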
ArcticDB
- Speed Test - ArcticDB, HDF, Feather, Parquet
  ArcticDB is a new data store for pandas DataFrames (https://arcticdb.io/). I have no affiliation with the project but wanted to see how it would compare on speed versus the other file-format storage options available in pandas: HDF, Feather, and Parquet. I could not find much online about how ArcticDB compares to the other options in terms of speed, so I ran some tests myself.
- ArcticDB – A DataFrame Database
- ArcticDB: A high-performance, serverless Pandas DataFrame database
- ArcticDB: A high performance, serverless DataFrame database
DataFrame
- New multithreaded version of C++ DataFrame was released
- DataFrame: NEW Data - star count:2013.0
- C++ DataFrame vs. Polars
  For a while, I have been hearing that Polars is so frighteningly fast that you shouldn’t look directly at it with unprotected eyes. So, I finally found time to learn a bit about Polars and write a very simple test/comparison for C++ DataFrame vs. Polars.
- C++ Show and Tell - July 2023
  I have worked on C++ DataFrame for the past 5+ years in my spare time. It is comparable to Pandas or R data.frame, although it includes a lot more functionality.
- Allocators; one of the ignored souls of STL
What are some alternatives?
prometheus - The Prometheus monitoring system and time series database.
datatable - A Python package for manipulating 2-dimensional tabular data structures
etcd - Distributed reliable key-value store for the most critical data of a distributed system
db-benchmark - reproducible benchmark of database-like ops
tidb - TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://tidbcloud.com/free-trial
sktime - A unified framework for machine learning with time series
polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust
zhetapi - A C++ ML and numerical analysis API, with an accompanying scripting language.
faiss - A library for efficient similarity search and clustering of dense vectors.
scientific-visualization-book - An open access book on scientific visualization using python and matplotlib
Tiger - C++ Matrix -- High performance and accurate (e.g. edge cases) matrix math library with expression template arithmetic operators
skorch - A scikit-learn compatible neural network library that wraps PyTorch