SimpleDB: A Basic RDBMS Built from Scratch

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • toydb

    Distributed SQL database in Rust, written as a learning project

  • bustub

    The BusTub Relational Database Management System (Educational)

  • There is also BusTub from CMU, which I stumbled upon earlier today:

    https://github.com/cmu-db/bustub

  • simple-db-hw-2021

  • The official GitHub repo for this course - https://github.com/MIT-DB-Class/simple-db-hw-2021 - specifically asks developers not to make their implementations public.

  • simpledb

    A simple database built from scratch that has some of the basic RDBMS features (SQL query parser, transactions, query optimizer)

  • OP, I am curious about the project license. Since it is built on top of the assignment code, can you claim your own copyright? - https://github.com/awelm/simpledb/blob/2e78bb2/LICENSE

  • prql

    PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement

  • Great questions! I'm not a database expert either but I can try answering these:

    1) I think databases like to manage pages directly because the db has more context than the OS and can therefore make better optimizations. For example, when aborting a transaction the db knows its dirty pages should be evicted (I'm not sure if mmap offers custom eviction). Also, I believe that if the db uses mmap, it loses control over when pages are flushed to disk. Flush control is necessary for guaranteeing transaction durability.

    2) What you're describing here sounds similar to an LSM-tree database (e.g. RocksDB). They are often used for write-heavy workloads because writes are just appends, but they might not be great for read-heavy workloads.

    3) This reminds me of PRQL[1] (which was trending on Hacker News last week) and Spark SQL. I'm not too familiar with this area though, so I can't really say why SQL was designed this way.

    [1] https://github.com/max-sixty/prql?utm_source=hackernewslette...
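
To make point 1 of the comment above concrete, here is a minimal Python sketch of a buffer pool that decides for itself when pages reach disk. It is not taken from any of the projects listed here: `ToyBufferPool`, `PAGE_SIZE`, and the policy of writing dirty pages only at commit are all assumptions made for illustration, and the sketch assumes that at most one transaction modifies a given page at a time.

```python
import os

PAGE_SIZE = 4096  # assumed page size, chosen for illustration


class ToyBufferPool:
    """Hypothetical page cache that decides for itself when pages reach disk."""

    def __init__(self, path):
        open(path, "ab").close()       # create the file if it is missing
        self.file = open(path, "r+b")
        self.pages = {}                # page number -> bytearray held in memory
        self.dirty = {}                # page number -> id of the modifying transaction

    def get_page(self, page_no):
        """Return the cached page, reading it from disk on a miss."""
        if page_no not in self.pages:
            self.file.seek(page_no * PAGE_SIZE)
            data = self.file.read(PAGE_SIZE)
            # Pad short reads (e.g. a page past end of file) with zero bytes.
            self.pages[page_no] = bytearray(data.ljust(PAGE_SIZE, b"\0"))
        return self.pages[page_no]

    def write(self, txn_id, page_no, offset, payload):
        """Modify a page in memory only; nothing is written to disk yet."""
        page = self.get_page(page_no)
        page[offset:offset + len(payload)] = payload
        self.dirty[page_no] = txn_id   # assumes one writer per page at a time

    def commit(self, txn_id):
        """Write this transaction's dirty pages and force them to stable storage."""
        for page_no in [p for p, owner in self.dirty.items() if owner == txn_id]:
            self.file.seek(page_no * PAGE_SIZE)
            self.file.write(self.pages[page_no])
            del self.dirty[page_no]
        self.file.flush()
        os.fsync(self.file.fileno())   # the explicit flush control from point 1

    def abort(self, txn_id):
        """Evict this transaction's dirty pages so the on-disk version is re-read."""
        for page_no in [p for p, owner in self.dirty.items() if owner == txn_id]:
            del self.pages[page_no]
            del self.dirty[page_no]

    def close(self):
        self.file.close()
```

In this sketch, a page modified by `write` stays in memory until `commit` writes it and calls `os.fsync`, and `abort` simply drops the modified copy so the next `get_page` reloads the committed version from disk. With a shared mmap mapping, by contrast, the OS is free to write a modified page back whenever it chooses, which is the loss of control the comment describes.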

