noisepage
bustub
| | noisepage | bustub |
|---|---|---|
| Mentions | 4 | 13 |
| Stars | 1,677 | 3,677 |
| Growth | - | 2.0% |
| Activity | 0.0 | 8.5 |
| Last commit | over 1 year ago | 7 days ago |
| Language | C++ | C++ |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
noisepage
-
The Part of PostgreSQL We Hate the Most (Multi-Version Concurrency Control)
> Carne
Okay, so, noisepage appears to be open source https://github.com/cmu-db/noisepage/
But I can't find the Ottertune Github page
Is any part of Ottertune open source?
-
Rethinking Stream Processing and Streaming Databases
I was one of the main authors of a research project called Peloton (https://github.com/cmu-db/peloton) which was later rebranded to NoisePage (https://github.com/cmu-db/noisepage). The initial version of RisingWave actually borrowed a lot from Peloton (fun fact: that's also how DuckDB https://duckdb.org/ started!), but we decided to rewrite in Rust due to development cost and security (e.g., memory leakage) considerations (more info: https://www.risingwave-labs.com/blog/building-a-cloud-database-from-scratch-why-we-moved-from-cpp-to-rust/).
-
Show HN: OtterTune – Automated Database Tuning Service for RDS MySQL/Postgres
> If I may, can you please shed light on why Peloton had to be archived and in essence re-done with OtterTune. Interested in your team's learnings from it from a software engineering point of view.
Peloton and OtterTune are completely different projects. Peloton was abandoned and rewritten as NoisePage (https://noise.page). OtterTune has always been OtterTune.
See this recent interview where I discuss why we gave up on Peloton:
https://www.ibm.com/cloud/blog/database-deep-dives-with-andy...
> - How did the team ensure this project doesn't suffer from the same disadvantages as its predecessor?
Again, different projects. OtterTune is all about not having to modify the internals of Postgres, MySQL, and any other DBMS. This is why we were able to support Oracle in the academic version in a short amount of time:
https://ottertune.com/blog/vldb-autonomous-database-tuning-i...
> - What would you advise other teams undertaking a rewrite to pay off their tech debts?
It is hard for me to provide general advice for this question because every situation is different.
> How does this project compare to / contrast with Google's and SingleStore's efforts in this space?
I am not familiar with Google or SingleStore using ML in the manner that we are with OtterTune to tune configuration knobs. Or at least I have not seen anything public about it.
These days Oracle is the most aggressive with pushing automated tuning capabilities (Oracle's autonomous DBaaS, AutoPilot for MySQL HeatWave). The difference between these approaches and OtterTune is that right now we are focused on configuration tuning (to avoid data privacy issues) and our core approach is platform/DBMS agnostic.
> Any chance we see you do a Peter Bailis and Sisu Data this? (:
I don't know what you mean by this? Peter Bailis is the Ryan Gosling of databases.
-
Resumable Allocator?
The state of the art for this sort of thing is LeanStore/Umbra (https://umbra-db.com/) or the new NoisePage database (https://github.com/cmu-db/noisepage/tree/master/src/storage). There is also the HyRise database, but that one focuses more on datasets that fit entirely in memory (https://hpi.de/plattner/projects/hyrise.html).
bustub
-
Can we create a thread for some of the best materials on CS available online?
"A Data-Centric Introduction to Computing"
https://dcic-world.org/
# Programming Language Theory:
"Programming Languages: Application and Interpretation"
https://www.plai.org/
# Compilation:
"Essentials of Compilation: An Incremental Approach in Python"
https://github.com/IUCompilerCourse/Essentials-of-Compilatio...
# Database Systems:
"CMU: Intro to Database Systems"
https://15445.courses.cs.cmu.edu/
"CMU: Advanced Database Systems"
https://15721.courses.cs.cmu.edu/
# Calculus I/II & Real Analysis
"A Course in Calculus and Real Analysis"
https://link.springer.com/book/10.1007/978-3-030-01400-1
"A Course in Multivariable Calculus and Analysis"
https://link.springer.com/book/10.1007/978-1-4419-1621-1
# Linear Algebra & ML:
* A Series of books by prof. Joe Suzuki without using any external library for the implementations *
"Statistical Learning with Math and Python"
https://link.springer.com/book/10.1007/978-981-15-7877-9
"Sparse Estimation with Math and Python"
https://link.springer.com/book/10.1007/978-981-16-1438-5
"Kernel Methods for Machine Learning with Math and Python"
https://link.springer.com/book/10.1007/978-981-19-0401-1
# Discrete Mathematics:
"CMU 21-228 Discrete Mathematics" (Prof. Po-Shen Loh)
https://www.math.cmu.edu/~ploh/2021-228.shtml
# Cryptography:
"Serious Cryptography: A Practical Introduction to Modern Encryption"
https://nostarch.com/seriouscrypto
# Problem Solving:
"Math 235: Mathematical Problem Solving"
https://www.cip.ifi.lmu.de/~grinberg/t/20f/
-
const/smart pointer confusions
The relevant classes are in https://github.com/cmu-db/bustub/blob/master/src/primer/trie.cpp and the header https://github.com/cmu-db/bustub/blob/master/src/include/primer/trie.h (see the README at the root of the GitHub repo for how to compile).
-
Any DSA resources that are NOT boring?
Take for example CMU's bustub DB. Great lecture material, plus their own pedagogical database where you implement parts of the database yourself.
-
The “Build Your Own Database” book is finished
This seems like a fairly shallow course: if you’re interested in some real awesome database hacking, I highly recommend bustub. It’s great and educational.
- 15-445 Projects source code
-
What's everyone working on this week (9/2023)?
Not a tutorial, but I completed all the assignments for the CMU Database Systems course (link) and watched all their YouTube videos before I started it (I highly recommend it, it's a great course and it's possible to submit the solutions even if you're not a CMU student. The entry code to Gradescope is in the FAQ). Though what I am doing is not rewriting bustub in Rust, as bustub uses two-phase locking to achieve transaction isolation, and this uses MVCC, pretty much like Postgres (though currently much simpler). I used this resource as a starting point for how it works.
- The BusTub Relational Database Management System (Educational)
-
SimpleDB: A Basic RDBMS Built from Scratch
There is also BusTub from CMU which I stumbled upon earlier today:
https://github.com/cmu-db/bustub
-
Online courses to learn more about databases and the concepts taught in Week 7?
Check this course from CMU.
- C++ Project Ideas
What are some alternatives?
openstack-ansible-os_trove - Role os_trove for OpenStack-Ansible. Mirror of code maintained at opendev.org.
prql - PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
sled - the champagne of beta embedded databases
ClickHouse - ClickHouse® is a free analytics DBMS for big data
LevelDB - LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
toydb - Distributed SQL database in Rust, written as a learning project
MongoDB - The MongoDB Database
dbdoc - Document your database schema, because your team will thank you, and a single text file makes it easy. Works well with PostgreSQL and others.
RocksDB - A library that provides an embeddable, persistent key-value store for fast storage.