barreldb
dslabs
barreldb
-
Build Your Own Fast, Persistent, Toy KV Store
I recently implemented the Bitcask paper in Golang and shared my learnings [on this post](https://mrkaran.dev/posts/barreldb/).
[GitHub](https://github.com/mr-karan/barreldb/) repo if interested.
Bitcask is an excellent paper that is not overwhelming to understand and offers a great stepping stone in building your own data stores. The simple yet powerful design of an append-only file is elegant and performant.
I’d love to read about more such implementations, if anyone has any recommendations.
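For anyone curious what the append-only design looks like in practice, here is a minimal Python sketch of the idea (not barreldb's code, which is written in Go). The `ToyBitcask` class and the record layout are made up for illustration: every write is appended to one data file, and an in-memory dict maps each key to the offset of its latest value. It leaves out most of what the paper covers, including checksums, tombstones, multiple data files, and compaction.

```python
import os
import struct
import tempfile

# Record layout (an assumption for this sketch, not barreldb's on-disk format):
# 4-byte key length, 4-byte value length, key bytes, value bytes.
HEADER = struct.Struct(">II")


class ToyBitcask:
    """Append-only data file plus an in-memory index of key -> (offset, size)."""

    def __init__(self, path):
        self.path = path
        self.index = {}
        # "a+b" creates the file if it is missing and appends on every write.
        self.file = open(path, "a+b")
        self._rebuild_index()

    def _rebuild_index(self):
        # Replay the log from the start; later records override earlier ones.
        self.file.seek(0)
        offset = 0
        while True:
            header = self.file.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            key_len, val_len = HEADER.unpack(header)
            key = self.file.read(key_len)
            self.file.seek(val_len, os.SEEK_CUR)  # skip the value bytes
            value_offset = offset + HEADER.size + key_len
            self.index[key] = (value_offset, val_len)
            offset = value_offset + val_len

    def put(self, key, value):
        # Never overwrite in place: append a new record and repoint the index.
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(HEADER.pack(len(key), len(value)) + key + value)
        self.file.flush()
        self.index[key] = (offset + HEADER.size + len(key), len(value))

    def get(self, key):
        # One dict lookup, then one seek and one read.
        entry = self.index.get(key)
        if entry is None:
            return None
        value_offset, size = entry
        self.file.seek(value_offset)
        return self.file.read(size)

    def close(self):
        self.file.close()


if __name__ == "__main__":
    db = ToyBitcask(os.path.join(tempfile.mkdtemp(), "toy.db"))
    db.put(b"lang", b"go")
    db.put(b"lang", b"golang")  # the old record stays on disk; the index moves on
    print(db.get(b"lang"))
    db.close()
```

Reopening the same file replays the log and rebuilds the index, which is roughly why the paper adds hint files: replaying the full data files on startup gets slow as they grow.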
dslabs
-
Smurf: Beyond the Test Pyramid
You'd define invariants that must be met. This has been done before.
https://en.wikipedia.org/wiki/Search-based_software_engineer...
e.g. testing implementations of Paxos: https://github.com/emichael/dslabs
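To make "define invariants that must be met" concrete, here is a small Python sketch of searching a state space for an invariant violation. It is not the dslabs API (dslabs is a Java framework); `find_violation`, the toy two-client counter, and the state encoding are all made up for illustration.

```python
from collections import deque


def find_violation(initial, next_states, invariant):
    """Breadth-first search over reachable states. Returns the shortest
    trace ending in a state that breaks the invariant, or None."""
    frontier = deque([(initial, (initial,))])
    seen = {initial}
    while frontier:
        state, trace = frontier.popleft()
        if not invariant(state):
            return list(trace)
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, trace + (nxt,)))
    return None


# Toy system: each client increments a shared counter with a non-atomic
# read followed by a write. State is (counter, ((pc, local), ...)) where
# pc is 0 = not started, 1 = has read, 2 = done.
def next_states(state):
    counter, clients = state
    for i, (pc, local) in enumerate(clients):
        if pc == 0:    # read the shared counter into a local variable
            new_counter, new_client = counter, (1, counter)
        elif pc == 1:  # write back local + 1
            new_counter, new_client = local + 1, (2, local)
        else:
            continue
        yield (new_counter, clients[:i] + (new_client,) + clients[i + 1:])


def invariant(state):
    # Once every client is done, the counter must equal the client count.
    counter, clients = state
    all_done = all(pc == 2 for pc, _ in clients)
    return (not all_done) or counter == len(clients)


initial = (0, ((0, 0), (0, 0)))
trace = find_violation(initial, next_states, invariant)
for step in trace or []:
    print(step)
```

The search finds the classic lost-update interleaving where both clients read before either writes. Checking a Paxos implementation works on the same principle, only with a much larger state space of message deliveries and timer events.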
-
The leadership myth in replicated databases (2023)
I recently took a Distributed Systems course, and I also thought it was very interesting and unexpected how in the most basic form of Paxos, there is no concept of node roles or hierarchy like leader/follower, master/replica, etc. The base case is that all nodes have the same replicated log, and are "writers" capable of initiating changes to the log.
This YouTube video was particularly helpful in learning about consensus algorithms, specifically Paxos/MultiPaxos:
https://www.youtube.com/watch?v=JEpsBg0AO6o
John Ousterhout (co-author of Raft) walks through Paxos/MultiPaxos as outlined by Leslie Lamport, and then talks about a series of optimizations to improve performance. One key optimization is moving from running consensus on a single log-slot proposal to running it over the entire log, which mitigates failed consensus rounds and is where concepts like leader nodes emerge. If you're familiar with Raft and Paxos, then as you listen to these optimizations being applied to MultiPaxos, you can notice it beginning to resemble Raft.
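As a rough illustration of the leaderless base case, here is a minimal single-decree Paxos sketch in Python. It is not taken from the video or from dslabs: the `Acceptor` class and `propose` function are hypothetical names, calls are synchronous and in-process, and there is no message loss, retry, or learner role. It also assumes each proposer picks proposal numbers that no other proposer uses.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0     # highest proposal number promised so far
        self.accepted = None  # (number, value) of the last accepted proposal

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    """Run one round for a single slot. Any node may call this; there is
    no leader. Returns the chosen value, or None if the round failed."""
    majority = len(acceptors) // 2 + 1

    replies = [a.prepare(n) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < majority:
        return None

    # If some acceptor already accepted a value, the proposer must adopt
    # the one with the highest proposal number instead of its own.
    prior = [p for p in promises if p is not None]
    if prior:
        value = max(prior, key=lambda p: p[0])[1]

    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None


if __name__ == "__main__":
    nodes = [Acceptor() for _ in range(3)]
    print(propose(nodes, 1, "x"))
    print(propose(nodes, 2, "y"))
```

Two proposers racing with ever-higher numbers can keep invalidating each other's promises, which is the kind of failed round that a stable leader is meant to avoid.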
The course I took was through Georgia Tech, but was largely based around a framework developed at the University of Washington called dslabs:
https://github.com/emichael/dslabs
It was super informative for my learning about the foundations of distributed systems, namely consensus algorithms. I'd highly recommend it for anyone interested in learning more. Although fair warning, the programming assignments were quite difficult and time consuming.
-
Show HN: Advent of Distributed Systems
I took a Distributed Systems course at Georgia Tech this spring, which used a learning framework from the University of Washington: https://github.com/emichael/dslabs
You make a key-value store using multiple techniques, from a basic single-node KV store, to a primary/replica, to Paxos, to sharded Paxos (which is essentially what AWS DynamoDB is).
There are tests to validate your implementation. I learned a ton from this, although I gave up at the last milestone because my grade was satisfactory in the class :)
-
Preparing for distributed systems in fall 2023
If you open the syllabus from OMSCS site, it says that the 5 programming assignments will be based on https://github.com/emichael/dslabs
-
anyone want to share their coding assignments?
If you want to learn about distributed systems, this project is very cool imo. https://ellismichael.com/dslabs/ Don’t put your solution in a public repository.
-
DSLabs solutions
Per the original author's repo:
-
Build Your Own Fast, Persistent, Toy KV Store
This might interest you as well: https://github.com/emichael/dslabs
That distributed systems lab is what Georgia Tech's Distributed Computing course[0] is based on, at least when I took the course back in 2021.
[0] - https://omscs.gatech.edu/cs-7210-distributed-computing
-
CS 7210 - Labs available
Haven't taken the course, so not sure on specifics, but the projects are generally based on dslabs, as highlighted in the syllabus for the course.
-
Gain experience in Distributed Systems
If the former, I don't want to discourage you, but good luck. In my MS we had to build an extremely simplified Paxos implementation; it took everyone 200+ hours, and I think 1 or 2 out of 50 students ended up with a fully correct implementation. Actually, that project is publicly available here: https://github.com/emichael/dslabs
-
I want to work on distributed systems: Rust or C++?
fwiw this is used in many graduate programs around the country to teach distributed systems concepts: https://github.com/emichael/dslabs and can be done completely independently.
What are some alternatives?
whirlog - a minimal versioned log structured relational DB in Common Lisp
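advanced-java - 😮 Core Interview Questions & Answers For Experienced Java(Backend) Developers | A complete primer of advanced knowledge for internet Java engineers, covering high concurrency, distributed systems, high availability, microservices, massive data processing, and related areas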
bitcask - A log-structured hash table for fast key/value data
DalvDB - A distributed Key/Value storage, which uses client devices as a replica and stores each user data in a different partition