PGM Indexes: Learned indexes that match B-tree performance with 83x less space

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

InfluxDB - Power Real-Time Data Analytics at Scale
Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
www.influxdata.com
featured
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com
featured
  • PGM-index

    🏅State-of-the-art learned data structure that enables fast lookup, predecessor, range searches and updates in arrays of billions of items using orders of magnitude less space than traditional indexes

  • Hi @legulere!

    Yep, the example of Figure 2 shows only a static PGM-index on a sorted array.

    Insertion and deletions are discussed in Section 3 "Dynamic PGM-index" and experimented in Section 7.3.

    The Dynamic PGM-index is open-source too: you can find the implementation at https://github.com/gvinciguerra/PGM-index/blob/master/includ... and the documentation at https://pgm.di.unipi.it/docs/cpp-reference/#classpgm_1_1_dyn...

  • ALEX

    A library for building an in-memory, Adaptive Learned indEX (by microsoft)

  • Also, a learned index from Microsoft: https://github.com/microsoft/ALEX

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • la_vector

    🔶 Compressed bitvector/container supporting efficient random access and rank queries

  • Hi Jouni!

    You may find interesting these other papers of ours:

    - The ALENEX21 paper "A 'learned' approach to quicken and compress rank/select dictionaries" (http://pages.di.unipi.it/vinciguerra/publication/learned-ran..., https://github.com/gvinciguerra/la_vector), where we introduce a compressed bitvector supporting efficient rank and select queries, which is competitive with several well-established implementations of succinct data structures.

    - The ICML20 paper "Why are learned indexes so effective?" (http://pages.di.unipi.it/vinciguerra/publication/learned-ind...) where we prove that, under some general assumptions on the input data, the space of the PGM-index is actually O(n/B^2) whp (versus Θ(n/B) of classic B-trees).

  • SOSD

    A Benchmark for Learned Indexes

  • For a detailed study of learned indexes, see this work: https://vldb.org/pvldb/vol14/p1-marcus.pdf

    All code is available in open source: https://github.com/learnedsystems/SOSD

  • RadixSpline

    A Single-Pass Learned Index

  • We index geospatial data using a learned index in this work (cf. Section 3): http://cidrdb.org/cidr2021/papers/cidr2021_paper19.pdf

    Code: https://github.com/learnedsystems/RadixSpline

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

    SaaSHub logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts

  • Self-indexing RDBMS? Could AI help?

    3 projects | /r/Database | 26 Apr 2023
  • RadixSpline: A Single-Pass Learned Index

    1 project | news.ycombinator.com | 25 Jan 2021
  • rosedb: A Lightweight Key/Value Storage Engine in Go

    1 project | /r/golang | 30 Jun 2023
  • Rosedb: Lightweight, fast and reliable key/value storage engine

    1 project | news.ycombinator.com | 30 Jun 2023
  • haskell todo list app (beginner)

    3 projects | /r/haskell | 8 Jun 2023