bustub VS ClickHouse

Compare bustub and ClickHouse and see how they differ.

bustub

The BusTub Relational Database Management System (Educational) (by cmu-db)
                bustub          ClickHouse
Mentions        4               40
Stars           921             20,779
Growth          6.0%            3.4%
Activity        7.1             10.0
Last commit     6 days ago      4 days ago
Language        C++             C++
License         MIT License     Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

bustub

Posts with mentions or reviews of bustub. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-01-28.

ClickHouse

Posts with mentions or reviews of ClickHouse. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-11-28.
  • Stream Processing Database
    4 projects | reddit.com/r/Database | 28 Nov 2021
    There's ksqlDB (open source, built with Java) and Materialize (there's a standalone edition); both need Kafka/Redpanda. There's also ClickHouse (open source, with materialized views on a specific engine, but you need to buffer the inserts using a proxy like KittenHouse or a buffering library like ch-timed-buffer). Is there any other alternative to those 3 (that similarly doesn't do a full scan to do aggregation)?
  • Open Source Analytics Stack: Bringing Control, Flexibility, and Data-Privacy to Your Analytics
    15 projects | dev.to | 25 Nov 2021
    Moreover, open-source warehouse tools can unlock additional insights from your data in real time and at a lower cost. PostgreSQL (website, repo) is a popular example of an efficient and low-cost data warehousing solution. Another example is ClickHouse (website, GitHub), an open-source, analytics-focused DBMS that generates analytical reports from data in real time using SQL.
  • Welcome to the free open-source OLAP server project
    2 projects | dev.to | 15 Nov 2021
    The most efficient way is to use column store databases as data sources for eMondrian. For example, ClickHouse could run as a powerful and fast query engine while eMondrian works as a proxy representing data as cubes and executing MDX queries.
  • How to speed up ClickHouse queries using materialized columns
    1 project | dev.to | 11 Nov 2021
    As of writing, there's a feature request on GitHub for adding commands to materialize specific columns on ClickHouse data parts.
    1 project | news.ycombinator.com | 26 Oct 2021
    Nice article. Materialized columns in ClickHouse are a bit like indexes in the sense that they give a faster path to the answer by reading less data.

    ClickHouse devs are adding a feature called semi-structured data that will optimize stored JSONs to columnar storage and also add convenient query syntax. [0] At that point the trade-off between stored JSON blobs and materialized columns will become a lot less stark than it is today.

    [0] https://github.com/ClickHouse/ClickHouse/issues/17623

  • To what extent do you use SQL in your job?
    1 project | reddit.com/r/datascience | 30 Oct 2021
    I'm not a business analyst but a software developer. I've worked quite a bit with event data. Think "Order Completed", "User Signed Up" and "Subscription Cancelled". When those events get channelled into a column-store database like Redshift or ClickHouse, you can answer a lot of advanced questions using SQL. In particular, ClickHouse has lots of useful functions for analysing datasets. See this analysis of GitHub events as an example.
  • What is ClickHouse how it compares to PostgreSQL and TimescaleDB for time series
    11 projects | news.ycombinator.com | 21 Oct 2021
    Hi Ajay! Thanks for the thoughtful response and email. I would love a direct meeting and will contact you shortly.

    I don't mean to gloss over ClickHouse imperfections. There are lots of them. For my money the biggest is that it still takes way too much expertise in ClickHouse for ordinary developers to use it effectively. Part of that is SQL compatibility, part of it is lack of tools of which simple backup is certainly one. To the extent that ClickHouse is risky, the risk is finding (and retaining) staff who can use it properly. Our business at Altinity exists in large part because of this risk, so I know it's real.

    The big aha! experience for me has been that the things like lack of ACID transactions or weak backup mechanisms are not necessarily the biggest issues for most ClickHouse users. I came to ClickHouse from a long background in RDBMS and transactional replication. Things that would be game ending in that environment are not in analytic systems.

    What's more interesting (mind-expanding even) is that techniques like deduplication of inserted blocks and async multi-master replication turn out to be just as important as ACID & backups to achieve reliable systems. Furthermore, services like Kafka that allow you to have DC-level logs are an essential part of building analytic applications that are reliable and performant at scale. We're learning about these mechanisms in the same way that IBM and others developed ACID transaction ideas in the 1970s--by solving problems in real systems. It's really fun to be part of it.

    My comment didn't convey this clearly, for which I heartily apologize. I certainly don't intend to portray ClickHouse as perfect and still less to bash Timescale. I don't know enough about the latter to make any criticism worth reading.

    p.s., Non-transactional insert (specifically non-atomicity across blocks and tables) is an undisputed problem. It's being fixed in https://github.com/ClickHouse/ClickHouse/issues/22086. Altinity and others are working on backups. Backup comes up in my job just about every day.

    11 projects | news.ycombinator.com | 21 Oct 2021
    One thing I was surprised to see is that ClickHouse and ElasticSearch have the same number of contributors. That's pretty astounding given how much older and more prominent ElasticSearch has been.

    https://github.com/ClickHouse/ClickHouse/graphs/contributors

    https://github.com/elastic/elasticsearch/graphs/contributors

  • Recommend a service for storing large amount of data except for Big Table
    1 project | reddit.com/r/googlecloud | 30 Sep 2021
    If this is analytical data, you could cheaply run your own ClickHouse system that could handle this load, though extracting the data wholesale later on would be tricky. Otherwise, and I know you said no, BigTable is probably the answer.
  • I Don't Think Elasticsearch Is a Good Logging System
    8 projects | news.ycombinator.com | 28 Sep 2021

What are some alternatives?

When comparing bustub and ClickHouse you can also consider the following projects:

VictoriaMetrics - VictoriaMetrics: fast, cost-effective monitoring solution and time series database

Trino - Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

MongoDB Libbson

TimescaleDB - An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.

RocksDB - A library that provides an embeddable, persistent key-value store for fast storage.

PostgreSQL - Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch

Adminer - Database management in a single PHP file

arrow-datafusion - Apache Arrow DataFusion and Ballista query engines

TileDB - The Universal Storage Engine

LevelDB - LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

cpp_redis

MySQL - MySQL Server, the world's most popular open source database, and MySQL Cluster, a real-time, open source transactional database.