S3 Express Is All You Need

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • napkin-math

    Techniques and numbers for estimating a system's performance from first principles

  • Most production storage systems/databases built on top of S3 spend a significant amount of effort building an SSD/memory caching tier to make them performant enough for production (e.g. on top of RocksDB). But it's not easy to keep it in sync with blob...

    Even with the cache, the cold query latency lower-bound to S3 is subject to ~50ms roundtrips [0]. To build a performant system, you have to tightly control roundtrips. S3 Express changes that equation dramatically, as S3 Express approaches HDD random read speeds (single-digit ms), so we can build production systems that don't need an SSD cache—just the zero-copy, deserialized in-memory cache.

    Many systems will probably continue to have an SSD cache (~100 us random reads), but now MVPs can be built without it, and cold query latency goes down dramatically. That's a big deal.

    We're currently building a vector database on top of object storage, so this is extremely timely for us... I hope GCS ships this ASAP. [1]

    [0]: https://github.com/sirupsen/napkin-math
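The round-trip argument in the comment above can be turned into a quick back-of-the-envelope calculation. The sketch below is only an illustration: the per-request latencies are the rough figures quoted in the comment (with 5 ms as an assumed stand-in for "single-digit ms"), and the function name and the choice of three sequential round trips are hypothetical.

```python
# Napkin-math sketch: cold-query latency when round trips cannot overlap.
# Latencies are the rough figures quoted in the comment, not measurements.
LATENCY_MS = {
    "s3_standard": 50.0,  # ~50 ms per round trip
    "s3_express": 5.0,    # "single-digit ms"; 5 ms is an assumed midpoint
    "local_ssd": 0.1,     # ~100 us random read
}


def cold_query_latency_ms(tier: str, sequential_round_trips: int) -> float:
    """Total latency when each round trip must finish before the next starts."""
    return LATENCY_MS[tier] * sequential_round_trips


if __name__ == "__main__":
    # Hypothetical query that needs three dependent reads (e.g. index, then data).
    for tier in LATENCY_MS:
        print(f"{tier:12s} {cold_query_latency_ms(tier, 3):8.2f} ms")
```

Under these assumed numbers, every dependent round trip removed from a query saves about 50 ms on standard S3 but only a few milliseconds on S3 Express, which is why the comment argues that an SSD tier becomes optional for an MVP.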

  • shim

    The Userify Shim (cloud agent) (by userify)

  • That's exactly how Userify[0] used to work when it was Python; now that it's a Go app, we do the caching in memory using Ristretto[1].

    0. https://userify.com (team ssh key management/sudo authz)

    1. https://github.com/dgraph-io/ristretto

  • ristretto

    A high performance memory-bound Go cache

  • quickwit

    Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.

  • We tested S3 Express for our search engine quickwit[0] a couple of weeks ago.

    While this was really satisfying on the performance side, we were a bit disappointed by the price, and I mostly agree with the article on this matter.

    I can see some very specific use cases where the pricing should be OK but currently, I would say most of our users should just stay on the classic S3 and add some local SSD caching if they have a lot of requests.

    [0] https://github.com/quickwit-oss/quickwit/
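The "classic S3 plus a local cache" setup suggested in the comment above follows a read-through pattern. The sketch below is a minimal illustration of that pattern, not Quickwit's implementation: it keeps objects in memory rather than on SSD, and `ReadThroughCache` and `fake_object_store_get` are hypothetical names standing in for a real cache and a real S3 GET.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Tiny LRU read-through cache that counts requests sent to the backend."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int) -> None:
        self._fetch = fetch            # called only on a cache miss
        self._capacity = capacity      # maximum number of cached objects
        self._items = OrderedDict()    # key -> bytes, oldest entry first
        self.backend_requests = 0

    def get(self, key: str) -> bytes:
        if key in self._items:
            self._items.move_to_end(key)     # mark as most recently used
            return self._items[key]
        self.backend_requests += 1           # miss: one billable backend request
        value = self._fetch(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return value


def fake_object_store_get(key: str) -> bytes:
    # Hypothetical stand-in for an object-store GET.
    return f"contents of {key}".encode()


if __name__ == "__main__":
    cache = ReadThroughCache(fake_object_store_get, capacity=2)
    for key in ["a", "a", "b", "a"]:
        cache.get(key)
    print("backend requests:", cache.backend_requests)
```

Repeated reads of a hot key are served locally, so the number of requests reaching the object store depends on the hit rate rather than on query volume.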

  • sccache

    Sccache is a ccache-like tool. It is used as a compiler wrapper and avoids compilation when possible. Sccache has the capability to utilize caching in remote storage environments, including various cloud storage options, or alternatively, in local storage.

  • I'm going to set up sccache [0] to use it tomorrow. We use MSVC, so EFS is off the cards.

    [0] https://github.com/mozilla/sccache/blob/main/docs/S3.md

  • mountpoint-s3

    A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.

  • Looks like support for S3 Express was merged in with version 1.3.0 just a few hours ago: https://github.com/awslabs/mountpoint-s3/pull/642
