Importing 3m rows/sec with io_uring

This page summarizes the projects mentioned and recommended in the original post on dev.to

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • ClickBench

    ClickBench: a Benchmark For Analytical Databases

  • Recently ClickHouse conducted a benchmark for their own database and many others, including QuestDB. The benchmark included data import as the first step. Since we were in the process of building a faster import, this benchmark provided us with nice test data and baseline results. So, what have we achieved? Let's find out. The benchmark was using QuestDB's HTTP import endpoint to ingest the data into an existing non-partitioned table. You may wonder why it doesn't use a partitioned table, which stores the data sorted by the timestamp values and provides many benefits for time series analysis. Most likely, the reason is terrible import execution time. Both HTTP-based import and pre-6.5 COPY SQL command are simply not capable of importing a big CSV file with unsorted data. Thus, the benchmark opts for a non-partitioned table with no designated timestamp column. The test CSV file may be downloaded and uncompressed following the commands:

  • fio

    Flexible I/O Tester

  • Luckily, newer Linux kernel versions support io_uring, a new asynchronous I/O interface. But would it help in our case? Learning the answer is simple and, in fact, doesn't even require a single line of code, thanks to fio, a very flexible I/O tester utility.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • QuestDB

    An open source time-series database for fast ingest and SQL queries

  • As usual, we encourage you to try out the latest QuestDB 6.5.2 release and share your feedback with our Slack Community. You can also play with our live demo to see how fast it executes your queries. And, of course, contributions to our open source project on GitHub are more than welcome.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts