DuckDB performance improvements with the latest release

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • datafusion

    Apache DataFusion SQL Query Engine

  • I'd be curious how the performance compares to [DataFusion](https://github.com/apache/arrow-datafusion), one of the top contenders to DuckDB in this area (although the two differ in many ways, I find it the closest of all the alternatives).

    ClickBench (from ClickHouse) has some benchmarks[1] where the two can be compared, but I'm not sure how up to date they are. At least a while back they were badly out of date, and I haven't looked closely at whether they are kept fair for everyone else :)

    [1]: benchmark.clickhouse.com

  • ClickBench

    ClickBench: a Benchmark For Analytical Databases

  • Looks like a recent PR bumped benchmark.clickhouse.com to DuckDB v0.9 on the 3rd.

    https://github.com/ClickHouse/ClickBench/pull/141

  • duckdb

    DuckDB is an in-process SQL OLAP Database Management System

  • Just had a look (https://github.com/duckdb/duckdb/issues/9399). Yeah, it's worrying that such a trivial query returned incorrect results, but credit to the devs for getting it fixed quickly.

    To my knowledge the only databases that can be described as "military-grade" in terms of testing are SQLite and Postgres.

  • pg-parquet-py

    A python script to write postgres data to a parquet file

  • If you have some data in PostgreSQL and want to query it with DuckDB (really fast), you can try extracting the data to a Parquet file; this file can then be queried from DuckDB with incredible speed. I've written a small program in Python that reads from PostgreSQL and exports to Parquet for anybody who wants to try it: https://github.com/spapas/pg-parquet-py#why
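
    The query side of that workflow might look roughly like the sketch below. It is not taken from pg-parquet-py: the helper names `parquet_count_sql` and `count_rows` and the file name `export.parquet` are made up for illustration, and it assumes the `duckdb` Python package is installed and that a Parquet file has already been exported.

```python
def parquet_count_sql(parquet_path: str) -> str:
    """Build a DuckDB query that counts the rows of a Parquet file."""
    # Double any single quotes so the path stays a valid SQL string literal.
    escaped = parquet_path.replace("'", "''")
    return f"SELECT count(*) FROM read_parquet('{escaped}')"


def count_rows(parquet_path: str) -> int:
    """Run the count query against an in-memory DuckDB database."""
    # Imported lazily so the SQL helper above works without DuckDB installed.
    import duckdb

    con = duckdb.connect()  # no path given, so the database is in-memory
    try:
        return con.execute(parquet_count_sql(parquet_path)).fetchone()[0]
    finally:
        con.close()


if __name__ == "__main__":
    # "export.parquet" is a hypothetical file produced by an earlier export.
    print(parquet_count_sql("export.parquet"))
```

    DuckDB reads the Parquet file in place through `read_parquet`, so there is no separate import step before querying.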

  • db-benchmark

    reproducible benchmark of database-like ops

  • I do think it was important for DuckDB to put out a new version of the results, as the earlier version of that benchmark [1] went dormant with a very old version of DuckDB that performed very badly, especially against Polars.

    [1] https://h2oai.github.io/db-benchmark/

