Launch HN: Drifting in Space (YC W22) – A server process for every user

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • stateroom

    A lightweight framework for building WebSocket-based application backends.

  • In the case of containers it gets tricky because of how it interacts with the scheduler (e.g. if a node is idle but has a bunch of paused containers that could be unpaused at any time, the scheduler has to decide how to proceed), but I love the concept. It's something I've thought a bit about in a world where the server can be compiled to WebAssembly, because it's imaginable to suspend it and serialize the memory state so that it can be sent off to storage somewhere and pulled out when the next request comes in. This was actually part of the motivation behind a library I wrote called Stateroom (https://github.com/drifting-in-space/stateroom), which creates a stateful WebSocket server as a WebAssembly module, but I haven't yet implemented the ability to freeze the state of the module between requests.

  • spawner

    Discontinued Session backend orchestrator for ambitious browser-based apps. [Moved to: https://github.com/drifting-in-space/plane]

  • Hi HN, we’re Paul and Taylor, and we’re launching Drifting in Space (https://driftingin.space). We build server software for performance-intensive browser-based applications. We make it easy to give every user of your app a dedicated server-side process, which starts when they open your application and stops when they close the tab.

    Many high-end web apps give every user a dedicated connection to a server-side process. That is how they get the low latency that you need for ambitious products like full-fledged video editing tools and IDEs. This is hard for smaller teams to recreate, though, because it takes a significant ongoing engineering investment. That’s where we come in—we make this architecture available to everyone, so you can focus on your app instead of its infrastructure. You can think of it like Heroku, except that each of your users gets their own server instance.

    I realized that something like this was needed while working on data-intensive tools at a hedge fund. I noticed that almost all new application software, whether it was built in-house or third-party SaaS, was delivered as a browser application rather than native. Although browsers are more powerful than ever, I knew from experience that industrial-scale data-heavy apps posed problems, because neither the browser or a traditional stateless server architecture could provide the compute resources needed for low-latency interaction with large datasets. I began talking about this with my friend Taylor, who had encountered similar limitations while working on data analysis and visualization tools at Datadog and Uber. We decided to team up and build a company around solving it.

    We have two products, an open source package and a managed platform. Spawner, the open source part, provides an API for web apps to spawn a session-lived process. It manages the process’s lifecycle, exposing it over HTTPS, tracking inbound connections, and shutting it down when it becomes idle (i.e. when the user closes their tab). It’s open source (MIT) and available at https://github.com/drifting-in-space/spawner.

    Jamsocket is our managed platform, which uses Spawner internally. It provides the same API, but frees you from having to deal with any cluster or network configuration to ship code. From an app developer’s point of view, using it is similar to using platforms like Netlify or Render. You stay in the web stack and never have to touch Kubernetes.

    Here's an example. Imagine you make an application for investigating fraud in a large transaction database. Users want to interactively filter, aggregate, and visualize gigabytes of transactions as a graph. Instead of sending all of the data down to the browser and doing the work there, you would put your code in a container and upload it to our platform. Then, whenever a fraud analyst opens your application, you hit an API we provide to spin up a dedicated backend for that analyst. Your browser code then opens a WebSocket connection directly to that backend, which it uses to stream data as the analyst applies filters or zooms/pans the visualization.

    We're different from most managed platforms because we give each user a dedicated process. That said, there are a few other services that do run long-lived processes for each user. Architecturally, we're most similar to Agones. Agones is targeted at games where the client can speak UDP to an arbitrary IP; we target applications that want to connect directly from browsers to a hostname over HTTPS. In the Erlang world, the OTP stack provides similar functionality, but you have to embrace Erlang/Elixir to get the benefits of it; we are entirely language-agnostic. Cloudflare Durable Objects support a form of long-lived processes, but are focused on use cases around program state synchronization rather than arbitrary high-compute/memory use cases.

    We have a usage-based billing model, similar to Heroku. We charge you for the compute you use and take a cut. Usage billing scales to zero, so it’s approachable for weekend experiments. We have not solidified a price plan yet, but we’re aiming to provide an instance capable of running VS Code (as an example) for about 10 cents an hour, fractionally metered. High-memory and high-CPU backends will cost more, and heavy users will get volume discounts. Our target customers are desktop-like SaaS apps and internal data tools.

    As mentioned, our core API is open source and available at https://github.com/drifting-in-space/spawner. The managed platform is in beta and we’re currently onboarding users from a waitlist, to make sure that we have the server capacity to scale. If you’re interested, you’re welcome to sign up for it here: https://driftingin.space.

    Have you built a similar infrastructure for your application? We’re always interested in hearing the approaches people have already taken to this problem and learning what their pain points are.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • Sandstorm

    Sandstorm is a self-hostable web productivity suite. It's implemented as a security-hardened web app package manager.

  • Does the managed service actually require that each user get their own container? For some applications, particularly collaborative ones, it would make much more sense to have a container for each top-level thing that the users are collaborating on, e.g. one per document. I think Sandstorm [1] got this right with its concept of grains, and I've long wanted a tool that brought that model, a stateful container per high-level object, running arbitrary code (unlike Cloudflare Durable Objects), to the world of hosted SaaS. Speaking of Cloudflare, I'm looking forward to seeing what their edge containers can do, when that feature is eventually made public.

    [1]: https://sandstorm.io/

  • falcon

    Brushing and linking for big data (by vega)

  • Good questions!

    > Why do you need one process per user? / Wouldn't this "event loop" actually be more efficient that one user/process, as there would be less context switching cost from the OS?

    We're particularly interested in apps that are often CPU-bound, so a traditional event-loop would be blocked for long periods of time. A typical solution is to put the work into a thread, so there would still be a context switch, albeit a smaller one.

    The process-per-user approach makes the most sense when a significant amount of the data used by each user does not overlap with other users. VS Code (in client/server mode) is a good example of this -- the overhead of siloing each process is relatively low compared to the benefits it gives. We think more data-heavy apps will make the same trade-offs.

    > Can I just keep a map of (connection, thread_id) on my server, and spawn one thread per user on my own server?

    If you don't have to scale beyond one server, this approach works fine, but it makes scaling horizontally complicated because you suddenly can't just use a plain old load balancer. It's not just about routing requests to the right server; deciding which server to run the threads on becomes complicated because you ideally want to decide based on the server load of each. We started going down this path, realized we'd end up re-inventing Kubernetes, so decided to embrace it instead.

    > Could I just load up my server with many cores, and give each user a SQLite database which runs each query in its own thread? This way a multi GB database would not be loaded into RAM, the query would filter it down to a result set.

    If, for a particular use case, it's economical to keep the data ready in a database that supports the query pattern users will make, it's probably not a good fit for a session-lived backend. In database terms, where our architecture makes sense is when you need to create an index on a dataset (or subset of a dataset) during the runtime of an application. For example, if you have thousands of large parquet files in blob storage and you want a user to be able to load one and run [Falcon](https://github.com/vega/falcon)-type analysis on it.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts