What is ClickHouse, how it compares to PostgreSQL and TimescaleDB for time series

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tsbs

    Time Series Benchmark Suite, a tool for comparing and evaluating databases for time series data

    Hello @PeterZaitsev!

    Actually Altinity is the one that contributed the bits to TSBS for benchmarking ClickHouse[1], so we are using the work that they contributed (and anyone is welcome to make a PR for updates or changes). We also had a former ClickHouse engineer look at the setup to verify it matched best practices with how CH is currently designed, given the TSBS dataset.

    As for the optimizations in the article you pointed to from 2019 (specifically how to query "last point" data more efficiently in ClickHouse), it uses a different table engine (AggregatingMergeTree) and a materialized view to get better response times for this query type.

    We (or someone in the community) could certainly add that optimization to the benchmark, but it wouldn't be using raw data, which we didn't think was appropriate for the benchmark analysis. If one wanted to use that optimization, then for an apples-to-apples comparison one should also use Continuous Aggregates for TimescaleDB, which I think would lead to results similar to what we show today.

    It's actually something we've talked about adding to TSBS for TimescaleDB (as an option to turn on/off) and maybe other DBs could do the same.

    [1]: https://github.com/timescale/tsbs/pull/26
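
The trade-off in the comment above (answering a "last point" query from raw data versus from state that a materialized view keeps up to date) can be sketched in plain Python. This is only an illustration of the idea, not ClickHouse or TimescaleDB code; the `Reading` fields and the `LastPointView` class are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Reading:
    host: str     # series identifier (hypothetical column name)
    ts: int       # timestamp, e.g. seconds since epoch
    value: float


def last_point_raw(rows: Iterable[Reading]) -> Dict[str, Reading]:
    """Answer the query from raw data: scan every row, keep the newest per host."""
    latest: Dict[str, Reading] = {}
    for r in rows:
        cur = latest.get(r.host)
        if cur is None or r.ts > cur.ts:
            latest[r.host] = r
    return latest


class LastPointView:
    """Pre-aggregated state updated at insert time, in the spirit of a
    materialized view. Queries read the small state, not the raw rows."""

    def __init__(self) -> None:
        self._latest: Dict[str, Reading] = {}

    def on_insert(self, batch: Iterable[Reading]) -> None:
        for r in batch:
            cur = self._latest.get(r.host)
            if cur is None or r.ts > cur.ts:
                self._latest[r.host] = r

    def query(self) -> Dict[str, Reading]:
        return dict(self._latest)


# Feed the same batches to the raw table and to the view.
raw_table: List[Reading] = []
view = LastPointView()
batches = [
    [Reading("a", 1, 0.1), Reading("b", 1, 0.2)],
    [Reading("a", 2, 0.3)],
]
for batch in batches:
    raw_table.extend(batch)
    view.on_insert(batch)

# Both approaches agree; they differ in how much data is read per query.
assert view.query() == last_point_raw(raw_table)
```

The raw scan grows with the size of the table, while the view's query cost depends only on the number of hosts. That is why the comment treats the pre-aggregated variant as a different kind of benchmark from one that runs on raw data.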

  • ClickHouse

    ClickHouse® is a free analytics DBMS for big data

    Hi Ajay! Thanks for the thoughtful response and email. I would love a direct meeting and will contact you shortly.

    I don't mean to gloss over ClickHouse imperfections. There are lots of them. For my money the biggest is that it still takes way too much expertise in ClickHouse for ordinary developers to use it effectively. Part of that is SQL compatibility, part of it is lack of tools of which simple backup is certainly one. To the extent that ClickHouse is risky, the risk is finding (and retaining) staff who can use it properly. Our business at Altinity exists in large part because of this risk, so I know it's real.

    The big aha! experience for me has been that things like the lack of ACID transactions or weak backup mechanisms are not necessarily the biggest issues for most ClickHouse users. I came to ClickHouse from a long background in RDBMS and transactional replication. Things that would be game ending in that environment are not in analytic systems.

    What's more interesting (mind-expanding even) is that techniques like deduplication of inserted blocks and async multi-master replication turn out to be just as important as ACID & backups to achieve reliable systems. Furthermore, services like Kafka that allow you to have DC-level logs are an essential part of building analytic applications that are reliable and performant at scale. We're learning about these mechanisms in the same way that IBM and others developed ACID transaction ideas in the 1970s--by solving problems in real systems. It's really fun to be part of it.

    My comment didn't convey this clearly, for which I heartily apologize. I certainly don't intend to portray ClickHouse as perfect and still less to bash Timescale. I don't know enough about the latter to make any criticism worth reading.

    p.s., Non-transactional insert (specifically non-atomicity across blocks and tables) is an undisputed problem. It's being fixed in https://github.com/ClickHouse/ClickHouse/issues/22086. Altinity and others are working on backups. Backup comes up in my job just about every day.
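
The "deduplication of inserted blocks" mentioned above can be illustrated with a small Python sketch: a retried insert of an identical block is recognized by its content hash and ignored, so a client can safely resend after a timeout. This is a simplified model under stated assumptions, not ClickHouse's implementation; the `DedupTable` class, the SHA-256 hash, and the window size are all choices made for the example.

```python
import hashlib
from collections import OrderedDict
from typing import Iterable, List, Tuple

Row = Tuple[str, int, float]


class DedupTable:
    """Toy table that drops an insert if an identical block was seen recently."""

    def __init__(self, window: int = 100) -> None:
        self.rows: List[Row] = []
        self._seen: "OrderedDict[str, None]" = OrderedDict()
        self._window = window  # how many recent block hashes to remember (arbitrary)

    @staticmethod
    def _block_id(block: List[Row]) -> str:
        # Hash the block contents in order; identical blocks get identical ids.
        h = hashlib.sha256()
        for row in block:
            h.update(repr(row).encode("utf-8"))
            h.update(b"\n")
        return h.hexdigest()

    def insert(self, block: Iterable[Row]) -> bool:
        """Return True if the block was stored, False if it was a duplicate."""
        block = list(block)
        block_id = self._block_id(block)
        if block_id in self._seen:
            return False
        self._seen[block_id] = None
        if len(self._seen) > self._window:
            self._seen.popitem(last=False)  # forget the oldest block hash
        self.rows.extend(block)
        return True


table = DedupTable()
block = [("host1", 1, 0.5), ("host2", 1, 0.7)]

first = table.insert(block)   # stored
retry = table.insert(block)   # same block resent after a timeout: ignored
```

With this behavior a writer does not need a transaction to get idempotent retries: resending the same block leaves the table unchanged, while a block with different contents is stored as usual.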

  • Elasticsearch

    Free and Open, Distributed, RESTful Search Engine

    One thing I was surprised to see is that ClickHouse and Elasticsearch have the same number of contributors. That's pretty astounding given how much older and more prominent Elasticsearch has been.

    https://github.com/ClickHouse/ClickHouse/graphs/contributors

    https://github.com/elastic/elasticsearch/graphs/contributors

  • clickhouse-operator

    Altinity Kubernetes Operator for ClickHouse creates, configures and manages ClickHouse clusters running on Kubernetes

    Don't use Helm. The ClickHouse Kubernetes Operator is the way to go. Here's the project: https://github.com/Altinity/clickhouse-operator

    This is generally true for most databases these days. Use an operator if it's available. Helm can't handle the dynamic management required to run databases properly.

  • VictoriaMetrics

    VictoriaMetrics: fast, cost-effective monitoring solution and time series database

    I think it is worth noting that while ClickHouse is often used as a time series store, it is not particularly designed for this use case, but more for storing logs, events and similar data. VictoriaMetrics would be an interesting comparison: it is inspired by ClickHouse's design but optimized for time series storage in particular. https://victoriametrics.com/

  • dbt-clickhouse

    The ClickHouse plugin for dbt (data build tool)

    Is your comment on ClickHouse and dbt based on using the dbt ClickHouse plugin? [0] If so, I would be very interested in understanding what you or others see as deficiencies.

    [0] https://github.com/silentsokolov/dbt-clickhouse
