Demand the impossible: rigorous database benchmarking

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • ldbc_snb_bi

    Reference implementations for the LDBC Social Network Benchmark's Business Intelligence (BI) workload

  • Rigorous database benchmarking is indeed very difficult and time-consuming. I spent the last ~7 years working on benchmarks for graph processing systems in the Linked Data Benchmark Council (LDBC) [1], originally established in 2012 as an EU research project.

    LDBC creates TPC-style application-level database benchmarks which can be used for system-to-system comparison. We provide detailed specifications, data generators, benchmark frameworks, and multiple reference implementations. Vendors implement the benchmarks for their database products, and the implementations are submitted to independent third-party auditors, who run them to ensure their correctness and reproducibility.

    We have found that there is a market for audits of graph processing systems, albeit a quite small one: over the last 4 years, we have published 34 audited results; see, e.g., [2] and [3].

    A major problem we face is that the process of implementing the benchmark for a system and getting an audited result is long (and therefore expensive). Vendors spend months implementing and tuning the benchmarks. It is also typical for the auditor to spend 50+ hours on the auditing process, which includes a lengthy code review step, setting up the system, running the experiments, testing ACID properties, writing a report, etc. The length of the process is exacerbated by the lack of standard graph query languages, which may require the auditor to learn a new query language or programming language.

    We have tried to mitigate this problem by improving our documentation, creating more reference implementations, and distributing pre-generated data sets. There are new standard graph query languages (SQL/PGQ, GQL), but their adoption is still very limited. Overall, the auditing process is quite long, which is mainly caused by the essential complexity of the problem: implementing an application-level benchmark and getting reliable results is very difficult.

    [1] https://ldbcouncil.org/introduction/

    [2] https://ldbcouncil.org/benchmarks/snb-interactive

    [3] https://ldbcouncil.org/benchmarks/snb-bi/
