System Design: The complete course

This page summarizes the projects mentioned and recommended in the original post on dev.to

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • system-design-primer

    Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

  • System Design Primer

  • Apache ZooKeeper

    Apache ZooKeeper

  • To solve this issue we can use a distributed system manager such as Zookeeper which can provide distributed synchronization. Zookeeper can maintain multiple ranges for our servers.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • RabbitMQ

    Open source RabbitMQ: core server and tier 1 (built-in) plugins

  • While this seems like a classic publish-subscribe use case, it is actually not as mobile devices and browsers each have their own way of handling push notifications. Usually, notifications are handled externally via Firebase Cloud Messaging (FCM) or Apple Push Notification Service (APNS) unlike message fan-out which we commonly see in backend services. We can use something like Amazon SQS or RabbitMQ to support this functionality.

  • PostgreSQL

    Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch

  • We already have access to the latitude and longitude of our customers, and with databases like PostgreSQL and MySQL we can perform a query to find nearby driver locations given a latitude and longitude (X, Y) within a radius (R).

  • MySQL

    MySQL Server, the world's most popular open source database, and MySQL Cluster, a real-time, open source transactional database.

  • We already have access to the latitude and longitude of our customers, and with databases like PostgreSQL and MySQL we can perform a query to find nearby driver locations given a latitude and longitude (X, Y) within a radius (R).

  • MongoDB

    The MongoDB Database

  • Since the data is not strongly relational, NoSQL databases such as Amazon DynamoDB, Apache Cassandra, or MongoDB will be a better choice here, if we do decide to use an SQL database then we can use something like Azure SQL Database or Amazon RDS.

  • GlusterFS

    Gluster Filesystem : Build your distributed storage in minutes

  • But where can we store files at scale? Well, object storage is what we're looking for. Object stores break data files up into pieces called objects. It then stores those objects in a single repository, which can be spread out across multiple networked systems. We can also use distributed file storage such as HDFS or GlusterFS.

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • envoy

    Cloud-native high-performance edge/middle/service proxy

  • Service-to-service communication is essential in a distributed application but routing this communication, both within and across application clusters, becomes increasingly complex as the number of services grows. Service mesh enables managed, observable, and secure communication between individual services. It works with a service discovery protocol to detect services. Istio and envoy are some of the most commonly used service mesh technologies.

  • elasticsearch-mapper-attachments

    Discontinued Mapper Attachments Type plugin for Elasticsearch

  • Sometimes traditional DBMS are not performant enough, we need something which allows us to store, search, and analyze huge volumes of data quickly and in near real-time and give results within milliseconds. Elasticsearch can help us with this use case.

  • consul

    Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.

  • Consul

  • ArangoDB

    🥑 ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.

  • For mutual friends, we can build a social graph for every user. Each node in the graph will represent a user and a directional edge will represent followers and followees. After that, we can traverse the followers of a user to find and suggest a mutual friend. This would require a graph database such as Neo4j and ArangoDB.

  • murder

    Large scale server deploys using BitTorrent and the BitTornado library (by ervinb)

  • Let's design a Twitter like social media service, similar to services like Facebook, Instagram, etc.

  • Stripe

    PHP library for the Stripe API.

  • Handling payments at scale is challenging, to simplify our system we can use a third-party payment processor like Stripe or PayPal. Once the payment is complete, the payment processor will redirect the user back to our application and we can set up a webhook to capture all the payment-related data.

  • Apache Spark

    Apache Spark - A unified analytics engine for large-scale data processing

  • Recording analytics and metrics is one of our extended requirements. We can capture the data from different services and run analytics on the data using Apache Spark which is an open-source unified analytics engine for large-scale data processing. Additionally, we can store critical metadata in the views table to increase data points within our data.

  • Redis

    Redis is an in-memory database that persists on disk. The data model is key-value, but many different kind of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.

  • Quadtree seems perfect for our use case, we can update the Quadtree every time we receive a new location update from the driver. To reduce the load on the quadtree servers we can use an in-memory datastore such as Redis to cache the latest updates. And with the application of mapping algorithms such as the Hilbert curve, we can perform efficient range queries to find nearby drivers for the customer.

  • Neo4j

    Graphs for Everyone

  • For mutual friends, we can build a social graph for every user. Each node in the graph will represent a user and a directional edge will represent followers and followees. After that, we can traverse the followers of a user to find and suggest a mutual friend. This would require a graph database such as Neo4j and ArangoDB.

  • NATS

    High-Performance server for NATS.io, the cloud and edge native messaging system.

  • Exactly once delivery and message ordering is challenging in a distributed system, we can use a dedicated message broker such as Apache Kafka or NATS to make our notification system more robust.

  • Memcached

    memcached development tree

  • In a location services-based platform, caching is important. We have to be able to cache the recent locations of the customers and drivers for fast retrieval. We can use solutions like Redis or Memcached but what kind of cache eviction policy would best fit our needs?

  • Apache Solr

    Apache Lucene and Solr open-source search software

  • Elasticsearch is a distributed, free and open search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. It is built on top of Apache Lucene.

  • kubernetes

    Production-Grade Container Scheduling and Management

  • Containers (eg. Kubernetes, Amazon ECS)

  • ApacheKafka

    A curated re-sources list for awesome Apache Kafka

  • Push notifications are an integral part of any social media platform. We can use a message queue or a message broker such as Apache Kafka with the notification service to dispatch requests to Firebase Cloud Messaging (FCM) or Apple Push Notification Service (APNS) which will handle the delivery of the push notifications to user devices.

  • gRPC

    The C based gRPC (C++, Python, Ruby, Objective-C, PHP, C#)

  • gRPC is a modern open-source high-performance Remote Procedure Call (RPC) framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking, authentication and much more.

  • foundation

    GraphQL Foundation Charter and Legal Documents (by graphql)

  • GraphQL is a query language and server-side runtime for APIs that prioritizes giving clients exactly the data they request and no more. It was developed by Facebook and later open-sourced in 2015.

  • FFmpeg

    Mirror of https://git.ffmpeg.org/ffmpeg.git

  • This results in a smaller size file and a much more optimized format for the target devices. Standalone solutions such as FFmpeg or cloud-based solutions like AWS Elemental MediaConvert can be used to implement this step of the pipeline.

  • etcd

    Distributed reliable key-value store for the most critical data of a distributed system

  • etcd

  • CouchDB

    Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability

  • Example: Apache Cassandra, CouchDB.

  • nodejs-storage

    Node.js client for Google Cloud Storage: unified object storage for developers and enterprises, from live data serving to data analytics/ML to data archiving.

  • We can use object stores like Amazon S3, Azure Blob Storage, or Google Cloud Storage for this use case.

  • nodejs-pubsub

    Node.js client for Google Cloud Pub/Sub: Ingest event streams from anywhere, at any scale, for simple, reliable, real-time stream analytics.

  • Google Pub/Sub

  • Apache Cassandra

    Mirror of Apache Cassandra

  • Data partitioning in Apache Cassandra.

  • auth0-java

    Java client library for the Auth0 platform

  • Auth0

  • Aerospike

    Aerospike Database Server – flash-optimized, in-memory, nosql database

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

    SaaSHub logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts