Our great sponsors
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
Thanks for taking the time to look through the architecture. There are definitely some choices that would have seemed weird to me when we set out to build this, but that we did not make lightly.
We actually initially built this on Kubernetes, twice. The MVP was Kubernetes + nginx where we created pods through the API and used the built-in DNS resolver. The post-MVP attempt fully embraced k8s, with our own CRD and operator pattern. It still exists in another branch of the repo[1].
Our decision to move off came because we realized we cared about a different set of things than Kubernetes did. For example, cold start time generally doesn’t matter that much to a stateless server architecture, but is vital for us because a user is actively waiting on each cold start. Moving away from k8s let us own the scheduling process, which helped us reduce cold start times significantly. There are other things we gain from it, some of which I’ve talked about in this comment tree[2]. I will say, it seemed like a crazy decision when I proposed it, but I have no regrets about it.
The point of sqlite was to allow the “drone” version to be updated in place without killing running backends. It also allows (but does not require) the components of the drone to run as separate containers. I originally wanted to use LLVM, but landed on sqlite. It’s a pretty lightweight dependency, it provides another point of introspection for a running system (the sqlite cli), and it’s not something people otherwise have to interact with. I wrote up my thought process for it at the time in this design doc[3].
You’re right about per-user being supported by Plane. I use per-user to convey that we treat container creation as so cheap and ephemeral you could give one to every user, but users can certainly share one and we’ve done that for exactly the data sync use case you describe.
[1] https://github.com/drifting-in-space/plane/tree/original-kub...
[2] https://news.ycombinator.com/item?id=32305234
[3] https://docs.google.com/document/d/1CSoF5Fgge_t1vY0rKQX--dWu...
Yes, for CPU bound processed on the BEAM you'll want to use a NIF (native implemented function) but that leaves you open to taking down the entire VM with bad NIF code (segfaults, infinite loops, etc). A purported safer means to create NIFs is to use Rustler (https://github.com/rusterlium/rustler) which lets you easily write NIFs in Rust instead of C. I haven't used it but I've heard good things.