Meta developer tools: Working at scale

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • sapling

    A Scalable, User-Friendly Source Control System.

  • Some recent tools have started to come out to fix Git's unfriendly UX.

    Meta's Sapling (1) is definitely one of them, but there are also `jj` (2) and `git-branchless` (3). These tools target a narrower set of workflows: one main branch inside a big repo, with everything else being short-lived branches or topics that can be treated as ephemeral stacks of patches, constantly uprooted and rebased on top of the main branch to derive the final result.

    If that's the workflow you use daily, then you should give these tools a try.

    (1): https://github.com/facebook/sapling
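
The "stack of patches rebased onto main" workflow described above can be pictured with a toy model. This sketch is not taken from Sapling, `jj`, or `git-branchless`; the `Commit` class and the `rebase_stack` and `history` helpers are hypothetical names invented for illustration, and real tools also have to replay diffs and handle conflicts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    """A toy commit: just a name and a link to its parent (hypothetical model)."""
    name: str
    parent: Optional["Commit"]


def rebase_stack(stack: List[str], new_base: Commit) -> Commit:
    """Re-create each patch in `stack` (oldest first) on top of `new_base`.

    Returns the new tip of the stack. The old commits are left untouched,
    which is why the stack can be treated as ephemeral.
    """
    tip = new_base
    for patch_name in stack:
        tip = Commit(name=patch_name, parent=tip)
    return tip


def history(tip: Optional[Commit]) -> List[str]:
    """List commit names from the root up to `tip`."""
    names = []
    node = tip
    while node is not None:
        names.append(node.name)
        node = node.parent
    return list(reversed(names))


# The main branch advanced from main-1 to main-2 while the stack was in progress.
main_v1 = Commit("main-1", None)
main_v2 = Commit("main-2", main_v1)

# Uproot the two-patch stack and replant it on the new main tip.
new_tip = rebase_stack(["patch-a", "patch-b"], main_v2)
print(history(new_tip))  # ['main-1', 'main-2', 'patch-a', 'patch-b']
```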

  • jj

    A Git-compatible VCS that is both simple and powerful

  • > Finally, Google is also moving away from Mercurial. Perhaps the JJ developers want to respond as to why (and check out Jujutsu -- it's a novel and very interesting system).

    Sure. I gave a talk about that at Git Merge 2022. https://github.com/martinvonz/jj#disclaimer has links to the slides and the recording from there. I'll summarize the problems we have with Mercurial here:

    1. Performance/scalability. That's partly because Python is slow. Both the Mercurial project and Meta have rewritten many parts of it in C and Rust as a result. Maybe more importantly, there are many assumptions in Mercurial's design that don't scale well. We have extensions for downloading only a slice of the repo. We slice it in both file space and in version space. However, the slice can get very expensive to change afterwards. For example, checking out an old revision that the user hasn't previously downloaded is very slow (it requires rewriting all local revisions after that point).

    2. Consistency. Mercurial was designed for local file systems, so when we store repos in our distributed file system, we run into write races that can corrupt repos.

    3. Integrations. We integrate with Mercurial by running the `hg` binary and parsing the output. That's unnecessarily complicated and slow.
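
To make the integration point concrete, here is a minimal sketch of what "running the `hg` binary and parsing the output" tends to look like. It is not the commenter's actual integration code; `LogEntry`, `parse_log`, and `hg_log` are hypothetical names, and the sketch assumes `hg` is on the `PATH` and that the tab-separated template below is acceptable for the caller.

```python
import subprocess
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    node: str     # full changeset hash
    summary: str  # first line of the commit message


# Mercurial expands the \t and \n escapes itself, so pass them through raw.
TEMPLATE = r"{node}\t{desc|firstline}\n"


def parse_log(output: str) -> List[LogEntry]:
    """Parse tab-separated `hg log` output into structured entries."""
    entries = []
    for line in output.splitlines():
        if not line:
            continue
        # Split on the first tab only, in case the summary contains tabs.
        node, _, summary = line.partition("\t")
        entries.append(LogEntry(node, summary))
    return entries


def hg_log(repo_path: str, limit: int = 5) -> List[LogEntry]:
    """Spawn `hg log` in `repo_path` and parse what it prints."""
    result = subprocess.run(
        ["hg", "log", "--limit", str(limit), "--template", TEMPLATE],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if hg exits non-zero
    )
    return parse_log(result.stdout)
```

Every call pays for a process spawn and a round trip through text, which is the overhead the comment describes; a library-level API would return structured data directly.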

    We also see several opportunities by switching to jj (in addition to hopefully fixing the problems above):

    1. Simpler workflows. Things like: working-copy commit (no "dirty working copy" errors, for example), undo, first-class conflicts (no interrupted rebases, for example). See the GitHub project for details.

    2. Cloud-based repos. The repos will be stored in a database instead of being stored in files on top of a distributed file system. That makes them much easier for our server to work with, and it opens up many kinds of integrations that were not feasible before.

    3. Simpler architecture. We designed jj from the beginning to be easy to integrate with our internal systems, so there should be far fewer workarounds.

    4. Simpler code base. You can typically add a command without worrying about concurrent commands, a dirty working copy, or conflicts. An example I like to mention is how I spent about two weeks trying to implement a command for amending into an ancestor commit in Mercurial. Then I implemented a more powerful version of that (one that can move changes from any commit to any other commit) in an hour in jj.

  • git-branchless

    High-velocity, monorepo-scale workflow for Git

  • xmeta2external

    lookup table of similar technology and services to help ex-Meta mates survive the *real* world

  • As a Meta employee as well, working in the data analytics / engineering space, I find the tooling to be of a pretty high standard, actually.

    We rely on a lot of Apache products, and whatever we use that's internal-only is great to work with. Daiq** recently started supporting notebooks, which has been a game changer for us as well as for the teams we work with.

    For those who are interested, a former DE made this nice repo that maps internal tools against "real world" products: https://github.com/thijsessens/xmeta2external

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
