PGM-index
RadixSpline
Our great sponsors
PGM-index | RadixSpline | |
---|---|---|
6 | 3 | |
742 | 118 | |
- | 0.8% | |
2.8 | 0.0 | |
20 days ago | 11 months ago | |
C++ | C++ | |
Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PGM-index
-
Self-indexing RDBMS? Could AI help?
PGM Index
-
Manticore Search 5
Manticore Columnar Library uses Piecewise Geometric Model index, which exploits a learned mapping between the indexed keys and their location in memory. The succinctness of this mapping, coupled with a peculiar recursive construction algorithm, makes the PGM-index a data structure that dominates traditional indexes by orders of magnitude in space while still offering the best query and update time performance.
-
PGM Indexes: Learned indexes that match B-tree performance with 83x less space
Hi @legulere!
Yep, the example of Figure 2 shows only a static PGM-index on a sorted array.
Insertion and deletions are discussed in Section 3 "Dynamic PGM-index" and experimented in Section 7.3.
The Dynamic PGM-index is open-source too: you can find the implementation at https://github.com/gvinciguerra/PGM-index/blob/master/includ... and the documentation at https://pgm.di.unipi.it/docs/cpp-reference/#classpgm_1_1_dyn...
Yep, I'm working on a multidimensional version that I hope to upload to the main repo (https://github.com/gvinciguerra/PGM-index) in a few weeks.
You are right, variable-length strings are difficult. You could try to pack as many characters as possible in a computer word (or in a big int data type), say P characters, and then use the PGM-index to find the strings that share a prefix of P chars with the given query string. I discussed this solution in a GitHub issue (https://github.com/gvinciguerra/PGM-index/issues/8#issuecomm...).
RadixSpline
-
Self-indexing RDBMS? Could AI help?
RadixSpline
-
PGM Indexes: Learned indexes that match B-tree performance with 83x less space
We index geospatial data using a learned index in this work (cf. Section 3): http://cidrdb.org/cidr2021/papers/cidr2021_paper19.pdf
What are some alternatives?
ALEX - A library for building an in-memory, Adaptive Learned indEX
rmi - A learned index structure
manticoresearch - Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
robin-map - C++ implementation of a fast hash map and hash set using robin hood hashing
sdsl-lite - Succinct Data Structure Library 3.0
SOSD - A Benchmark for Learned Indexes
la_vector - 🔶 Compressed bitvector/container supporting efficient random access and rank queries
bolt - 10x faster matrix and vector operations
kudu - Mirror of Apache Kudu
Huffman-Coding - A C++ compression program based on Huffman's lossless compression algorithm and decoder.
LearnedSecondaryIndex - A read-optimized learned index for unsorted data