bustub VS RocksDB

Compare bustub and RocksDB to see how they differ.

bustub

The BusTub Relational Database Management System (Educational) (by cmu-db)

RocksDB

A library that provides an embeddable, persistent key-value store for fast storage. (by facebook)
              bustub        RocksDB
Mentions      4             11
Stars         921           21,292
Growth        6.0%          1.2%
Activity      7.1           9.8
Last commit   6 days ago    4 days ago
Language      C++           C++
License       MIT License   GNU General Public License v3.0 or later
The number of mentions indicates the total number of mentions we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we are tracking.

bustub

Posts with mentions or reviews of bustub. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-01-28.

RocksDB

Posts with mentions or reviews of RocksDB. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-08-07.
  • Distributed SQL Essentials: Sharding and Partitioning in YugabyteDB
    1 project | dev.to | 21 Nov 2021
    The SST files store the key-value pairs for tables and indexes. Sharding is the right term here because each tablet is a database (based on RocksDB), with its own protection. This looks like the sharded databases we described above, except that they are not SQL databases but key-value document stores. They have all the required features for a reliable datastore, with transactions and strong consistency. However, they don’t have the burden of managing them as multiple databases because the SQL layer is above. Joins and secondary indexes are not processed at this level because this prevents cross-shard transactions.
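The routing idea the post describes can be sketched in a few lines: hash the row key to pick a tablet, where each tablet conceptually wraps its own key-value store (a per-tablet RocksDB instance in YugabyteDB's case). The `Tablet` struct and `route_to_tablet` helper below are illustrative stand-ins, not YugabyteDB's actual API:

```cpp
// Hypothetical sketch of hash-sharding rows across tablets. Each tablet
// conceptually owns its own key-value store; here a std::map stands in
// for the per-tablet RocksDB instance. Names are illustrative only.
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

struct Tablet {
    // Stand-in for a per-tablet RocksDB instance.
    std::map<std::string, std::string> kv;
};

// Pick the tablet responsible for a key; every write for that key then
// touches exactly one tablet's store, so single-key operations never
// need a cross-shard transaction.
std::size_t route_to_tablet(const std::string& key, std::size_t num_tablets) {
    return std::hash<std::string>{}(key) % num_tablets;
}

void put(std::vector<Tablet>& tablets, const std::string& key,
         const std::string& value) {
    tablets[route_to_tablet(key, tablets.size())].kv[key] = value;
}
```

A SQL layer sitting above such tablets would handle joins and secondary indexes, which is why, as the post notes, they are not processed at the tablet level.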
  • Hello guys , needed help for building a key-value data store
    1 project | reddit.com/r/Database | 9 Oct 2021
    - RocksDB - a key-value store that uses an LSM tree;
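The LSM-tree idea behind RocksDB can be illustrated with a toy in-memory model: writes land in a sorted memtable; when it fills, it is flushed as an immutable sorted run (an SST file on disk in a real engine); reads check the memtable first, then runs from newest to oldest. This is a sketch of the concept only, not RocksDB's implementation:

```cpp
// Toy LSM sketch: a mutable sorted memtable plus a stack of immutable
// sorted runs. Real engines write runs to disk as SST files and compact
// them in the background; both are omitted here. Illustrative names only.
#include <map>
#include <optional>
#include <string>
#include <vector>

class ToyLSM {
public:
    explicit ToyLSM(std::size_t memtable_limit) : limit_(memtable_limit) {}

    void put(const std::string& k, const std::string& v) {
        memtable_[k] = v;
        if (memtable_.size() >= limit_) {
            runs_.push_back(memtable_);  // flush: memtable becomes an immutable run
            memtable_.clear();
        }
    }

    std::optional<std::string> get(const std::string& k) const {
        if (auto it = memtable_.find(k); it != memtable_.end()) return it->second;
        for (auto run = runs_.rbegin(); run != runs_.rend(); ++run)  // newest first
            if (auto it = run->find(k); it != run->end()) return it->second;
        return std::nullopt;
    }

private:
    std::size_t limit_;
    std::map<std::string, std::string> memtable_;           // sorted, mutable
    std::vector<std::map<std::string, std::string>> runs_;  // sorted, immutable
};
```

The key property: writes are always sequential appends to the newest level, which is what makes LSM stores fast for write-heavy workloads.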
  • We built an open-source SQL DB for Intel SGX enclaves
    3 projects | reddit.com/r/cybersecurity | 7 Aug 2021
    Hi everyone! Our team just released EdgelessDB, an open-source database built on MariaDB that runs completely inside Intel SGX enclaves. As storage engine, it uses RocksDB with a custom encryption engine. The engine uses AES-GCM and is optimized for RocksDB’s specific SST file layout and the enclave environment. It has some nice properties like global confidentiality and verifiability and it considers strong attackers like malicious admins or rootkits. It also delivers rather low overheads (<10% for the TPC-C benchmark on Azure). In short: all data is only ever decrypted inside the enclave. This is different from other databases, where data and corresponding keys are processed in the clear in memory. We believe this is useful because (1) it’s very secure and (2) it enables some interesting use cases, like secure data pooling between parties. If you’re interested in trying it out: here’s a quickstart guide. In essence, you can run the Docker image with a single command on any recent Intel Xeon with SGX. Code and more info can be found on GitHub. Would be great to get your feedback on this :-)
  • Apache Hudi - The Streaming Data Lake Platform
    8 projects | dev.to | 27 Jul 2021
    Hudi tables can be used as sinks for Spark/Flink pipelines, and the Hudi writing path provides several enhanced capabilities over the file writing done by vanilla parquet/avro sinks. Hudi classifies write operations carefully into incremental operations (insert, upsert, delete) and batch/bulk operations (insert_overwrite, insert_overwrite_table, delete_partition, bulk_insert) and provides relevant functionality for each in a performant and cohesive way. Both upsert and delete operations automatically handle merging of records with the same key in the input stream (say, a CDC stream obtained from an upstream table), then look up the index and finally invoke a bin-packing algorithm to pack data into files while respecting a pre-configured target file size. An insert operation, on the other hand, is intelligent enough to avoid the precombining and index lookup while retaining the benefits of the rest of the pipeline. Similarly, the bulk_insert operation provides several sort modes for controlling initial file sizes and file counts when importing data from an external table into Hudi. The other batch write operations provide MVCC-based implementations of the typical overwrite semantics used in batch data pipelines, while retaining all the transactional and incremental processing capabilities, making it seamless to switch between incremental pipelines for regular runs and batch pipelines for backfilling or dropping older partitions. The write pipeline also contains lower-layer optimizations for handling large merges, such as spilling to RocksDB or an external spillable map, and multi-threaded/concurrent I/O to improve write performance.
    8 projects | dev.to | 27 Jul 2021
    There is a fundamental tradeoff today in data lakes between faster writing and great query performance. Faster writing typically involves writing smaller files (and clustering them later) or logging deltas (and merging on read later). While this already provides good performance, the pursuit of great query performance often warrants opening fewer files/objects on lake storage and perhaps pre-materializing the merges between base and delta logs. After all, most databases employ a buffer pool or block cache to amortize the cost of accessing storage. Hudi already contains several design elements that are conducive to building a caching tier (write-through, or even just populated by an incremental query) that will be multi-tenant and can cache pre-merged images of the latest file slices, consistent with the timeline. The Hudi timeline can be used to simply communicate caching policies, just like how we perform inter-table service coordination. Historically, caching has been done closer to the query engines or via intermediate in-memory file systems. By placing a caching tier closer to, and more tightly integrated with, a transactional lake storage like Hudi, all query engines would be able to share and amortize the cost of the cache while supporting updates/deletes as well. We look forward to building a buffer pool for the lake that works across all major engines, with contributions from the rest of the community.
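The "buffer pool or block cache" the post mentions is typically an LRU cache: keep hot blocks in memory and evict the least-recently-used one when full. A minimal sketch of that mechanism (illustrative only, not Hudi's or RocksDB's actual cache) looks like this:

```cpp
// Minimal LRU block cache sketch: a doubly-linked list ordered from
// hottest (front) to coldest (back), plus a hash map from key to its
// position in the list. Illustrative only.
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

class LruCache {
public:
    explicit LruCache(std::size_t capacity) : cap_(capacity) {}

    void put(const std::string& key, const std::string& block) {
        if (auto it = index_.find(key); it != index_.end()) {
            it->second->second = block;
            order_.splice(order_.begin(), order_, it->second);  // mark hot
            return;
        }
        if (order_.size() == cap_) {         // full: evict the coldest block
            index_.erase(order_.back().first);
            order_.pop_back();
        }
        order_.emplace_front(key, block);
        index_[key] = order_.begin();
    }

    std::optional<std::string> get(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;             // cache miss
        order_.splice(order_.begin(), order_, it->second);       // mark hot
        return it->second->second;
    }

private:
    std::size_t cap_;
    std::list<std::pair<std::string, std::string>> order_;  // front = hottest
    std::unordered_map<std::string,
        std::list<std::pair<std::string, std::string>>::iterator> index_;
};
```

Both lookups and updates are O(1): `std::list::splice` moves a node to the front without invalidating the iterators stored in the map.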
  • Ribbon filter: Practically smaller than Bloom and Xor
    2 projects | news.ycombinator.com | 11 Jul 2021
  • How to configure kafka so that it can pick up and uses latest RocksDB for Kafka Streams
    2 projects | reddit.com/r/Kafka | 13 Jun 2021
    As per this PR (https://github.com/facebook/rocksdb/pull/7714), I'm seeing that RocksDB has fixed this issue on their end. Can anyone please tell me how to update, or how to let Kafka Streams use a later version of RocksDB? Please let me know if this is the correct approach or whether I should do something different.
  • Just finished a Biff rewrite (batteries-included web framework)
    3 projects | reddit.com/r/Clojure | 18 May 2021
    This is an issue in RocksDB's RocksJava lib; they sorted out M1 support quickly in the main RocksDB libs, but the Java lib still has an open issue: https://github.com/facebook/rocksdb/issues/7720
  • Has anyone managed to upgrade to v16 with Rook?
    1 project | reddit.com/r/ceph | 7 May 2021
    Fix RocksDB SIGILL error on Raspberry PI 4
  • Nano full node on Mac M1
    2 projects | reddit.com/r/nanocurrency | 26 Feb 2021
    There's an issue with RocksDB, and some other irritating CMake things I forgot in the last week. This needs to get merged, for one: https://github.com/facebook/rocksdb/pull/7714

What are some alternatives?

When comparing bustub and RocksDB you can also consider the following projects:

LMDB - Read-only mirror of official repo on openldap.org. Issues and pull requests here are ignored. Use OpenLDAP ITS for issues.

LevelDB - LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

SQLite - Unofficial git mirror of SQLite sources (see link for build instructions)

sled - the champagne of beta embedded databases

ClickHouse - ClickHouse® is a free analytics DBMS for big data

TileDB - The Universal Storage Engine

Sophia - Modern transactional key-value/row storage library.

upscaledb - A very fast lightweight embedded database engine with a built-in query language.

cpp_redis

Bedrock - Rock solid distributed database specializing in active/active automatic failover and WAN replication

debezium - Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

Hiredis - Minimalistic C client for Redis >= 1.2