rustle
peaks-consolidation
Metric | rustle | peaks-consolidation
---|---|---
Mentions | 1 | 37
Stars | 7 | 102
Growth | - | -
Activity | 0.0 | 9.6
Latest commit | about 2 years ago | 5 months ago
Language | Go | Go
License | - | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
rustle
peaks-consolidation
- Filter a 7 billion-row dataset using 32GB Memory
  Script and Data
- Success in Lighting Fast Billion-Row Sorting Using a 32GB Desktop PC
- Stochastic Sublinear Streaming Algorithms
  These streaming algorithms will support billion-row sorting.
- Billion-row Sorting Scripts for Peaks, Polars, Pandas and DuckDB
  Sample data for 100,000 rows
- Understand The SQL Execution Order
  SplitFile2Folder filters a big CSV file, or a folder containing many CSV files, into a folder/sub-folder structure, producing many table partitions.
- OrderBy{Ledger(A) Account(D) Quantity(FloatA) Part_No(A)}
  I am now upgrading Peaks to support not only OrderBy but also billion-row sorting use cases using only 32GB of memory.
- Success to Build 2 functions: SplitFile2Folder and FilterFromFolder
  BillionRowsTestingLog is a set of processing logs included in the 1st pre-release delivery. These focus on billion-row databending exercises.
- Implement Go Streaming to Process Over Memory Size Dataset
  I have decided to share the algorithms I have written for a period of time, spending additional time to maintain the repository.
- Solved to Filter/Summarize Data from Huge CSV Files
- Solved Huge CSV File
What are some alternatives?
imcache - A zero-dependency generic in-memory cache Go library
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
buntdb - BuntDB is an embeddable, in-memory key/value database for Go with custom indexing and geospatial support
Fieldnotes - Public repository of my field notes from 25+ years as computer guy
hazelcast-go-client - Hazelcast Go Client
gobyexample - Go by Example
flashdb - FlashDB is an embeddable, in-memory key/value database in Go (with Redis like commands and super easy to read)
polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust
GCache - An in-memory cache library for golang. It supports multiple eviction policies: LRU, LFU, ARC
chi - lightweight, idiomatic and composable router for building Go HTTP services
gin-boilerplate - The fastest way to deploy a RESTful API with the Gin Framework, with a structured project that defaults to a PostgreSQL database and JWT authentication middleware stored in Redis
peaks-framework - The Peaks Consolidation is equipped with state-of-the-art algorithms and data structures that support high-performance databending exercises. It specializes in management accounting and consolidation, with some special topics in machine learning and bioinformatics. [Moved to: https://github.com/hkpeaks/peaks-consolidation]