LiteFS Cloud: Distributed SQLite with Managed Backups

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • marmot

    A distributed SQLite replicator built on top of NATS

  • Great that you brought it up. I will fill in the perspective of what I am doing to solve this in Marmot (https://github.com/maxpert/marmot). Today Marmot already records changes by installing triggers on a table, so offline changes (made while Marmot is not running) are never lost. When Marmot comes up after a long time offline (depending on the max_log_size configuration), it detects that and catches up by restoring a snapshot and then applying the rest of the logs from the NATS (JetStream) change logs. I am working on a change that will publish those change logs to NATS before it restores the snapshot; once it reapplies those changes after restoring the snapshot, everyone will have your changes and your DB will be up to date. One thing that bothers people in this case is that if two nodes come up with conflicting rows, the last writer wins.

    For that I am also exploring SQLite-Y-CRDT (https://github.com/maxpert/sqlite-y-crdt), which can help me treat each row as a document and then try to merge them. I personally think CRDTs are sometimes hard to reason about, and might not be explainable to an entry-level developer. Usually when something is hard to reason about and explain, I prefer sticking to simplicity. People IMO will be much more comfortable knowing they can't use auto-incrementing IDs for particular tables (because two independent nodes can increment the counter to the same value) vs. "here is a magical way to merge that will mess up your data".
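The trigger-based change capture described in the comment above can be illustrated with a minimal sketch using Python's standard sqlite3 module. This is not Marmot's actual schema or code: the `users` table, the `change_log` table, and the trigger names are all hypothetical, chosen only to show how triggers keep recording changes even when no replicator process is running.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);

-- Hypothetical change-log table; Marmot's real schema may differ.
CREATE TABLE change_log (
    seq        INTEGER PRIMARY KEY AUTOINCREMENT,
    table_name TEXT NOT NULL,
    op         TEXT NOT NULL,
    row_id     INTEGER NOT NULL
);

-- One trigger per operation; each appends a row to the change log.
CREATE TRIGGER users_ai AFTER INSERT ON users BEGIN
    INSERT INTO change_log (table_name, op, row_id)
    VALUES ('users', 'insert', NEW.id);
END;
CREATE TRIGGER users_au AFTER UPDATE ON users BEGIN
    INSERT INTO change_log (table_name, op, row_id)
    VALUES ('users', 'update', NEW.id);
END;
CREATE TRIGGER users_ad AFTER DELETE ON users BEGIN
    INSERT INTO change_log (table_name, op, row_id)
    VALUES ('users', 'delete', OLD.id);
END;
""")

# Ordinary writes by the application; no replicator is involved here.
conn.execute("INSERT INTO users (name) VALUES ('alice')")
conn.execute("UPDATE users SET name = 'bob' WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 1")
conn.commit()

# A replicator could later read the log in order and publish it.
changes = conn.execute(
    "SELECT op, row_id FROM change_log ORDER BY seq"
).fetchall()
print(changes)  # [('insert', 1), ('update', 1), ('delete', 1)]
```

Because the log is written by the database itself, changes made while the replicator is offline are still there to be read and published once it comes back up.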

  • sqld

    [Discontinued] LibSQL with extended capabilities like HTTP protocol, replication, and more.

  • There's https://github.com/libsql/sqld, but SQLite's concurrency model doesn't always work well with long-lived transactions (and just the network hop can be slower than a local transaction), especially if you want to write.

  • litefs

    FUSE-based file system for replicating SQLite databases across a cluster of machines

  • LiteFS works sorta like that. It provides read replicas on all your application servers so you can use it just like vanilla SQLite for queries.

    Write transactions have to occur on the primary node but that's mostly because of latency. SQLite operates in serializable isolation so it only allows one transaction at a time. If you wanted to have all nodes write then you'd need to acquire a lock on one node and then update it and then release the lock. We actually allow this on LiteFS using something called "write forwarding" but it's pretty slow so I wouldn't suggest it for regular use.

    We're adding an optional query API over HTTP [1] soon as well. It's inspired by Turso's approach. That'll let you issue one or more queries in a batch over HTTP and they'll be run in a single transaction.

    [1]: https://github.com/superfly/litefs/issues/326
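The single-writer behavior mentioned in the comment above can be seen in plain SQLite, without LiteFS. The following is a minimal sketch using Python's standard sqlite3 module; the file path and the `kv` table are made up for illustration, and the database is assumed to be in SQLite's default rollback-journal mode. It shows only local locking, not LiteFS's write forwarding.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None lets us issue BEGIN/COMMIT ourselves;
# timeout=0 makes a blocked writer fail immediately instead of waiting.
a = sqlite3.connect(path, timeout=0, isolation_level=None)
b = sqlite3.connect(path, timeout=0, isolation_level=None)
a.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

# Connection A takes the write lock.
a.execute("BEGIN IMMEDIATE")
a.execute("INSERT INTO kv VALUES ('x', '1')")

# Connection B cannot start a write transaction while A holds the lock.
try:
    b.execute("BEGIN IMMEDIATE")
    second_writer_blocked = False
except sqlite3.OperationalError as exc:
    second_writer_blocked = True
    print("second writer rejected:", exc)

# Once A commits, B can write.
a.execute("COMMIT")
b.execute("BEGIN IMMEDIATE")
b.execute("INSERT INTO kv VALUES ('y', '2')")
b.execute("COMMIT")

rows = b.execute("SELECT k FROM kv ORDER BY k").fetchall()
print(rows)  # [('x',), ('y',)]
```

Across machines, the same acquire, write, release cycle has to travel over the network, which is consistent with the comment's point that letting every node write is slow.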

  • rqlite

    The lightweight, distributed relational database built on SQLite.

  • mycelite

    Mycelite is a SQLite extension that allows you to synchronize changes from one instance of SQLite to another.

  • donutdb

    Store and query a sqlite db directly backed by DynamoDB.

  • Man this is cool. While I really enjoy my own solution of using a custom SQLite vfs that stores your db transparently in dynamodb[0], this really is a compelling alternative.

    I wonder how viable this would be to use from AWS Lambda? It seems like the way Lambda does concurrency probably doesn't play all that well with LiteFS. Maybe it's time to move some workloads over to fly.io.

    [0]: https://github.com/psanford/donutdb

  • litestack

  • I’m working on this for Rails apps at https://github.com/oldmoe/litestack/pull/12

    The idea is that people with small-to-medium size Rails Turbo apps should be able to deploy them without needing Redis or Postgres.

    I’ve gotten as far as deploying this stack _without_ LiteFS and it works great. The only downside is the application queues requests on deploy, but for some smaller apps it’s acceptable to have the client wait for a few seconds while the app restarts.

    When I get that PR merged I’ll write about how it works on Fly and publish it to https://fly.io/ruby-dispatch/.

  • sqlite-y-crdt

    Y-CRDT extension for SQLite
