starrocks vs duckdb

starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. It received InfoWorld’s 2023 BOSSIE Award for best open source software. (by StarRocks)

Source Code

starrocks.io

duckdb

DuckDB is an in-process SQL OLAP Database Management System (by duckdb)

SQL Database Olap Analytics embedded-database

Source Code

duckdb.org

Project         starrocks            duckdb
Mentions        12                   52
Stars           7,764                16,749
Growth          4.9%                 10.7%
Activity        10.0                 10.0
Latest Commit   4 days ago           about 7 hours ago
Language        Java                 C++
License         Apache License 2.0   MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
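
As a rough illustration of the growth metric described above, the following minimal sketch computes month-over-month growth from two star counts. The function name and the snapshot values are hypothetical and are not taken from this page; the exact formula the site uses is not stated, so a plain percentage change is assumed here.

```python
def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Return the percentage change in stars between two monthly snapshots.

    Assumption: growth is a plain percentage change relative to the
    previous month's count. The site does not publish its exact formula.
    """
    if stars_last_month <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_last_month) * 100.0 / stars_last_month


if __name__ == "__main__":
    # Hypothetical snapshots, chosen only to illustrate the arithmetic.
    growth = month_over_month_growth(1000, 1049)
    print(f"{growth:.1f}%")
```

Under this assumption, a project that goes from 1,000 to 1,049 stars in a month shows 4.9% growth.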

starrocks

Posts with mentions or reviews of starrocks. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-09.

A MySQL compatible database engine written in pure Go
10 projects | news.ycombinator.com | 9 Apr 2024

tidb has been around for a while, it is distributed, written in Go and Rust, and MySQL compatible. https://github.com/pingcap/tidb
Somewhat relatedly, StarRocks is also MySQL compatible, written in Java and C++, but it's tackling OLAP use-cases. https://github.com/StarRocks/starrocks
StarRocks – sub-second MPP OLAP database for full analytics scenarios
1 project | news.ycombinator.com | 23 Jan 2024
Let's Talk about Joins
2 projects | news.ycombinator.com | 20 Jan 2024

I think you're talking about doing denormalization before importing data into an OLAP system to avoid subsequent joins. However, this greatly limits the flexibility of data modeling. Moreover, denormalization can be a headache-inducing process. In fact, I have tested StarRocks (https://github.com/StarRocks/starrocks), and it is capable of performing joins while streaming data imports, and the speed is very fast. It's worth giving it a try.
Ask HN: Are there any notable Chinese FLOSS projects?
4 projects | news.ycombinator.com | 11 May 2023

https://github.com/apache/doris is a great example. Same for its cousin https://github.com/StarRocks/starrocks, which was an early fork of the Doris project.
To be fair, these are the only examples I can think of and I only learned of these as I'm standing up new data infra using starrocks.
Open Source Columnar Databases
2 projects | /r/dataengineering | 17 Mar 2023

ClickHouse and StarRocks are similar. They are both columnar databases powered by vectorization tech, which means they are really fast.
Ask HN: Do you use any software (mainly) developed in China?
3 projects | news.ycombinator.com | 27 Feb 2023

StarRocks, it’s a Linux Foundation project now, but a lot of the initial team and community behind it came from China.
https://github.com/StarRocks/starrocks
Funny that I hadn’t heard of them in the database space till they showed up at the top of ClickBench. Makes me wonder what other interesting projects I’m missing out on in China.
Anyone using StarRocks DB instead of ClickHouse?
1 project | /r/dataengineering | 17 Nov 2022
Show HN: A benchmark for analytical databases (Snowflake, Druid, Redshift)
11 projects | news.ycombinator.com | 13 Jul 2022

Full disclosure - I work for StarRocks (starrocks.com)
First of all, this is great. Transparent and healthy competition is always great for the customers!
Regarding the joined table queries that are missing in the tests, this is exactly why we built StarRocks - to give people the best performance of complex analytics queries on both joined tables and single tables.
I encourage you to check out this blog: https://starrocks.medium.com/starrocks-outperforms-clickhous...
And, give us a star if you think we are doing the right thing: https://github.com/StarRocks/starrocks
Follow us on LinkedIn for the latest updates: https://www.linkedin.com/company/starrocks
We are looking for a very fast database for big data analysis, does anyone know about starrocks, I heard it is very fast
1 project | /r/programming | 22 Dec 2021
wow, i found a super fast database for Big Data analytics,it's called StarRocks,come and take a look!
1 project | /r/programming | 22 Dec 2021

duckdb

Posts with mentions or reviews of duckdb. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-06.

🪄 DuckDB sql hack : get things SORTED w/ constraint CHECK
1 project | dev.to | 4 Apr 2024
DuckDB: Move to push-based execution model (2021)
1 project | news.ycombinator.com | 15 Mar 2024
DuckDB performance improvements with the latest release
8 projects | news.ycombinator.com | 6 Nov 2023

I'm not sure if the fix is reassuring or not: https://github.com/duckdb/duckdb/pull/9411/files
Building a Distributed Data Warehouse Without Data Lakes
3 projects | news.ycombinator.com | 2 Nov 2023

It's an interesting question!
The problem is that the data is spread everywhere - no choice about that. So with that in mind, how do you query that data? Today, the idea is that you HAVE to put it into a central location. With tools like Bacalhau[1] and DuckDB [2], you no longer have to - a single query can be sharded amongst all your data - EFFECTIVELY giving you a lot of what you want from a data lake.
It's not a replacement, but if you can do a few of these items WITHOUT moving the data, you will be able to see really significant cost and time savings.
[1] https://github.com/bacalhau-project/bacalhau
[2] https://github.com/duckdb/duckdb
DuckDB 0.9.0
3 projects | news.ycombinator.com | 26 Sep 2023
Push or Pull, is this a question?
2 projects | dev.to | 9 Aug 2023

[4] Switch to Push-Based Execution Model by Mytherin · Pull Request #2393 · duckdb/duckdb (github.com)
Show HN: Hydra 1.0 – open-source column-oriented Postgres
12 projects | news.ycombinator.com | 3 Aug 2023

it depends on your query obviously.
In general, I did very deep benchmarking of pg, clickhouse and duckdb, and I sure didn't make stupid mistakes like this: https://news.ycombinator.com/item?id=36990831
My dataset has 50B rows and 2 TB of data, and I think columnar DBs are very overhyped. I chose pg because:
- pg performance is acceptable, maybe 2-3x slower than clickhouse and duckdb on some queries, if pg is configured correctly and run on compressed storage
- clickhouse and duckdb start falling apart very fast because they specialize in a very narrow type of query: https://github.com/ClickHouse/ClickHouse/issues/47520 https://github.com/ClickHouse/ClickHouse/issues/47521 https://github.com/duckdb/duckdb/discussions/6696
🦆 Effortless Data Quality w/duckdb on GitHub ♾️
3 projects | dev.to | 25 Jul 2023

This action installs duckdb with the version provided in input.
Using SQL inside Python pipelines with Duckdb, Glaredb (and others?)
6 projects | /r/dataengineering | 30 Jun 2023

Duckdb: https://github.com/duckdb/duckdb - seems pretty popular, been keeping an eye on this for close to a year now.
CSV or Parquet File Format
3 projects | /r/Python | 1 Jun 2023

The Parquet-Go library is very complex, and I have not yet managed to use it successfully. So I asked whether DuckDB can provide an API: https://github.com/duckdb/duckdb/issues/7776

What are some alternatives?

When comparing starrocks and duckdb you can also consider the following projects:

ClickBench - ClickBench: a Benchmark For Analytical Databases

ClickHouse - ClickHouse® is a free analytics DBMS for big data

doris - Apache Doris is an easy-to-use, high performance and unified analytics database.

sqlite-worker - A simple, and persistent, SQLite database for Web and Workers.

TablePlus - TablePlus macOS issue tracker

datasette - An open source multi-tool for exploring and publishing data

clickhouse-bulk - Collects many small inserts to ClickHouse and send in big inserts

octosql - OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.

LakeSoul - LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.

metabase-clickhouse-driver - ClickHouse database driver for the Metabase business intelligence front-end

datafusion - Apache DataFusion SQL Query Engine
