| | interpolative_coding | PGM-index |
|---|---|---|
| Mentions | 1 | 6 |
| Stars | 27 | 769 |
| Growth | - | - |
| Activity | 0.0 | 6.2 |
| Last commit | over 1 year ago | about 2 months ago |
| Language | C++ | C++ |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
interpolative_coding
- Time-Series Compression Algorithms
I didn't see binary interpolative coding (BIC) referenced. It is one of my favorites, introduced to me by the book "Managing Gigabytes" by Witten, Moffat, and Bell [1]. It achieves a great compression ratio on sorted integer sequences and is commonly used in inverted indexes.
There is a neat implementation [2] and a technical paper [3] by Giulio Ermanno Pibiri, which I just found today while looking for it.
[1] https://people.eng.unimelb.edu.au/ammoffat/mg/
[2] https://github.com/jermp/interpolative_coding
[3] http://pages.di.unipi.it/pibiri/papers/BIC.pdf
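To make the idea behind BIC concrete, here is a minimal Python sketch. It is not the code from the repository in [2]; the function names `bic_encode` and `bic_decode` are hypothetical, and it assumes a strictly increasing list of integers known to lie in `[lo, hi]`. For simplicity it writes each value with a plain fixed-width binary code, whereas practical implementations typically use tighter minimal binary codes. The middle element is encoded first, relative to the narrowest range it could occupy, and both halves are then encoded recursively inside the ranges that value implies.

```python
def bic_encode(values, lo, hi):
    """Encode a strictly increasing list of ints in [lo, hi] as a bit string."""
    out = []

    def rec(left, right, lo, hi):
        # Encode values[left:right], all known to lie in [lo, hi].
        if left >= right:
            return
        mid = (left + right) // 2
        # (mid - left) elements must fit below values[mid],
        # (right - mid - 1) elements must fit above it.
        low = lo + (mid - left)
        high = hi - (right - mid - 1)
        width = (high - low).bit_length()
        if width:  # width 0 means the value is fully determined
            out.append(format(values[mid] - low, f"0{width}b"))
        rec(left, mid, lo, values[mid] - 1)
        rec(mid + 1, right, values[mid] + 1, hi)

    rec(0, len(values), lo, hi)
    return "".join(out)


def bic_decode(bits, n, lo, hi):
    """Inverse of bic_encode: rebuild n values from the bit string."""
    values = [0] * n
    pos = 0

    def rec(left, right, lo, hi):
        nonlocal pos
        if left >= right:
            return
        mid = (left + right) // 2
        low = lo + (mid - left)
        high = hi - (right - mid - 1)
        width = (high - low).bit_length()
        offset = int(bits[pos:pos + width], 2) if width else 0
        pos += width
        values[mid] = low + offset
        rec(left, mid, lo, values[mid] - 1)
        rec(mid + 1, right, values[mid] + 1, hi)

    rec(0, n, lo, hi)
    return values


if __name__ == "__main__":
    seq = [3, 4, 7, 13, 14, 15, 21, 25, 36, 38, 54, 62]
    code = bic_encode(seq, 0, 62)
    print(len(code), "bits for", len(seq), "values")
    print(bic_decode(code, len(seq), 0, 62) == seq)
```

One consequence of narrowing the range at every step is that a dense run costs nothing: when the remaining range has exactly as many slots as there are remaining values, every width is zero and no bits are written.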
PGM-index
- Self-indexing RDBMS? Could AI help?
PGM Index: Piecewise Geometric Model Index
- Manticore Search 5
Manticore Columnar Library uses the Piecewise Geometric Model index (PGM-index), which exploits a learned mapping between the indexed keys and their location in memory. The succinctness of this mapping, coupled with a peculiar recursive construction algorithm, makes the PGM-index a data structure that dominates traditional indexes by orders of magnitude in space while still offering the best query and update time performance.
- PGM Indexes: Learned indexes that match B-tree performance with 83x less space
Yep, I'm working on a multidimensional version that I hope to upload to the main repo (https://github.com/gvinciguerra/PGM-index) in a few weeks.
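The "learned mapping between the indexed keys and their location in memory" mentioned in the Manticore excerpt can be illustrated with a small Python sketch. This is not the PGM-index library's API or its construction algorithm: the names `build_segments` and `lookup` are hypothetical, the index has a single level rather than the recursive structure, and segments are built with a simple greedy "shrinking cone" pass instead of the optimal piecewise linear fit the real data structure uses. It assumes a sorted list of distinct numeric keys and an integer error bound `eps`. Each segment is a line that predicts a key's position to within `eps`, so a lookup only needs a binary search over a small window around the prediction.

```python
from bisect import bisect_left, bisect_right


def build_segments(keys, eps):
    """Greedy piecewise linear fit over sorted, distinct keys.

    Returns a list of (first_key, slope, first_pos) tuples such that each
    line predicts the position of every key it covers to within eps.
    """
    segments = []
    start = 0
    slo, shi = 0.0, float("inf")  # feasible slope range for this segment
    for i in range(1, len(keys) + 1):
        if i < len(keys):
            dx = keys[i] - keys[start]
            dy = i - start
            new_lo = max(slo, (dy - eps) / dx)
            new_hi = min(shi, (dy + eps) / dx)
            if new_lo <= new_hi:  # point i still fits: shrink the cone
                slo, shi = new_lo, new_hi
                continue
        # Close the current segment (single-point segments get slope 0).
        slope = slo if shi == float("inf") else (slo + shi) / 2
        segments.append((keys[start], slope, start))
        start = i
        slo, shi = 0.0, float("inf")
    return segments


def lookup(keys, segments, eps, key):
    """Return the index of key in keys, or None if it is absent."""
    if not segments or key < segments[0][0]:
        return None
    # Last segment whose first key is <= key (slopes are always finite,
    # so the probe tuple sorts after any segment starting at `key`).
    j = bisect_right(segments, (key, float("inf"), 0)) - 1
    first_key, slope, first_pos = segments[j]
    pred = first_pos + int(slope * (key - first_key))
    # Widen by one slot to absorb the truncation in int().
    lo = max(0, pred - eps - 1)
    hi = min(len(keys), pred + eps + 2)
    i = bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else None


if __name__ == "__main__":
    keys = [i * i for i in range(1000)]
    segs = build_segments(keys, eps=4)
    print(len(segs), "segments for", len(keys), "keys")
    print(lookup(keys, segs, 4, 49), lookup(keys, segs, 4, 50))
```

The space saving comes from storing only three numbers per segment instead of one entry per key; a larger `eps` yields fewer segments at the cost of a wider final search window.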
What are some alternatives?
simple8b-timeseries-compression
ALEX - A library for building an in-memory, Adaptive Learned indEX
manticoresearch - Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
FastPFor - The FastPFOR C++ library: Fast integer compression
robin-map - C++ implementation of a fast hash map and hash set using robin hood hashing
sdsl-lite - Succinct Data Structure Library 3.0
SOSD - A Benchmark for Learned Indexes
RadixSpline - A Single-Pass Learned Index
bolt - 10x faster matrix and vector operations
Huffman-Coding - A C++ compression program based on Huffman's lossless compression algorithm and decoder.
la_vector - 🔶 Compressed bitvector/container supporting efficient random access and rank queries