marmot vs seafowl

Project	marmot	seafowl
Mentions	33	11
Stars	1,628	355
Growth	-	2.0%
Activity	8.6	9.3
Latest Commit	3 months ago	2 days ago
Language	Go	Rust
License	MIT License	Apache License 2.0

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

marmot

Posts with mentions or reviews of marmot. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-11.

Distributed SQLite: Paradigm shift or hype?
1 project | news.ycombinator.com | 9 Apr 2024

If you're willing to accept eventual consistency (a big ask, but acceptable in some scenarios) then there are options like marmot [1] that replicate CDC (change data capture) over NATS.
[1]: https://github.com/maxpert/marmot
Marmot: Multi-writer distributed SQLite based on NATS
1 project | /r/hypeurls | 11 Dec 2023

4 projects | news.ycombinator.com | 11 Dec 2023
Why you should probably be using SQLite
8 projects | news.ycombinator.com | 27 Oct 2023
The Raft Consensus Algorithm
5 projects | news.ycombinator.com | 3 Sep 2023

I've written a whole SQLite replication system that works on top of Raft (https://github.com/maxpert/marmot). The best part is that Raft has a well-understood and strong library ecosystem as well. I started off with libraries, and when I noticed I was reimplementing distributed streams, I just took an off-the-shelf implementation (https://docs.nats.io/nats-concepts/jetstream) and embedded it in the system. I love the simplicity and reasoning that comes with Raft. However, I am playing with EPaxos these days (https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf), because then I can decentralize the design into a truly masterless implementation. Right now I've added a sharding mechanism on various streams so that in high-load cases masters can be distributed across nodes too.
SQLedge: Replicate Postgres to SQLite on the Edge
9 projects | news.ycombinator.com | 9 Aug 2023

Very interesting! I have a question (out of my experience with https://github.com/maxpert/marmot): how do you get around the boot time, especially when the change log of a table is pretty large in Postgres? I've implemented a snapshotting mechanism in Marmot as part of quickly getting up to speed. At some level I wonder if we can just feed this PG replication log into a NATS cluster so that Marmot can replicate it across the board.
Show HN: Blueprint for a distributed multi-region IAM with Go and CockroachDB
4 projects | news.ycombinator.com | 8 Aug 2023

One of the reasons I started writing Marmot (https://maxpert.github.io/marmot/) was for replicating a bunch of read-heavy tables across regions. I even used it for cache replication (because who cares if it's a cache miss, but a hit will save me time and money). It's hard to make such blueprints in the early days of a product, and by the time you hit true growth almost everyone builds a custom solution for multi-region IAM.
Stalwart All-in-One Mail Server (IMAP, JMAP, SMTP)
4 projects | news.ycombinator.com | 18 Jul 2023

Amazing, I was just looking for a good mail server to configure for my demo. Which reminds me: since you folks have mentioned LiteStream, have you tried Marmot (https://github.com/maxpert/marmot)? I recently configured Isso with Marmot to scale it out horizontally (https://maxpert.github.io/marmot/demo). I am super curious what kind of write workload an organization of under a thousand people will have, and whether Marmot can help scale it horizontally without FoundationDB. I always find the convenience of SQLite amazing.
Marmot: A distributed SQLite replicator built on top of NATS
1 project | news.ycombinator.com | 5 Jul 2023
LiteFS Cloud: Distributed SQLite with Managed Backups
9 projects | news.ycombinator.com | 5 Jul 2023

Great that you brought it up. I will fill in the perspective of what I am doing to solve this in Marmot (https://github.com/maxpert/marmot). Today Marmot already installs triggers that record changes to a table, so offline changes (made while Marmot is not running) are never lost. When Marmot comes up after a long time offline (depending upon the max_log_size configuration), it detects that and catches up by restoring a snapshot and then applying the rest of the logs from the NATS (JetStream) change logs. I am working on a change that will publish those change logs to NATS before it restores the snapshot; once it reapplies those changes after restoring the snapshot, everyone will have your changes and your DB will be up to date. One thing that bothers people in this case is that if two nodes come up with conflicting rows, the last writer wins.
For that I am also exploring SQLite-Y-CRDT (https://github.com/maxpert/sqlite-y-crdt), which can help me treat each row as a document and then try to merge them. I personally think CRDTs sometimes get harder to reason about, and might not be explainable to entry-level developers. Usually when something is hard to reason about and explain, I prefer sticking to simplicity. People IMO will be much more comfortable knowing they can't use auto-incrementing IDs for particular tables (because two independent nodes can increment the counter to the same value) vs. here is a magical way to merge that will mess up your data.
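
The trigger-based capture and the auto-increment collision described in this post can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `notes` and `change_log` tables, the trigger name, and the helper functions are hypothetical names chosen for illustration; this is not Marmot's actual schema or replication code, and NATS is left out entirely.

```python
import sqlite3


def make_node() -> sqlite3.Connection:
    """Create an in-memory node with a data table and a change-capture trigger."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE notes (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT);
        -- Hypothetical change-log table; not Marmot's real schema.
        CREATE TABLE change_log (
            seq    INTEGER PRIMARY KEY AUTOINCREMENT,
            row_id INTEGER,
            body   TEXT
        );
        -- Every insert into notes is also recorded in change_log.
        CREATE TRIGGER notes_insert AFTER INSERT ON notes
        BEGIN
            INSERT INTO change_log (row_id, body) VALUES (NEW.id, NEW.body);
        END;
    """)
    return db


def pending_changes(db: sqlite3.Connection) -> list:
    """Read the captured changes in the order they were made."""
    return db.execute(
        "SELECT row_id, body FROM change_log ORDER BY seq"
    ).fetchall()


def apply_changes(db: sqlite3.Connection, changes: list) -> None:
    """Apply remote changes; an existing row with the same id is overwritten,
    so whichever write is applied last wins."""
    for row_id, body in changes:
        db.execute(
            "INSERT OR REPLACE INTO notes (id, body) VALUES (?, ?)",
            (row_id, body),
        )


# Two independent nodes each insert one row while disconnected.
node_a = make_node()
node_b = make_node()
node_a.execute("INSERT INTO notes (body) VALUES ('written on A')")
node_b.execute("INSERT INTO notes (body) VALUES ('written on B')")

changes_a = pending_changes(node_a)
changes_b = pending_changes(node_b)

# Both nodes assigned id 1, so replaying B's log on A replaces A's row.
apply_changes(node_a, changes_b)
rows_a = node_a.execute("SELECT id, body FROM notes").fetchall()
```

Both nodes hand out the same auto-incremented id, so the row written on node A is silently replaced once node B's log is replayed. A real replicator would also need to keep replayed rows from being captured again by the trigger, which this sketch does not attempt.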

seafowl

Posts with mentions or reviews of seafowl. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-09-06.

Gcsfuse: A user-space file system for interacting with Google Cloud Storage
15 projects | news.ycombinator.com | 6 Sep 2023

In case you're interested in scale-to-zero database hosting, a few months ago I paired gcsfuse with Seafowl [0][1], an early stage open source database written in Rust. Was a lot of fun balancing tradeoffs that are usually not possible with classical databases e.g. Postgres. Thank you gcsfuse contributors.
[0] https://seafowl.io
DuckDB 0.8.0
5 projects | news.ycombinator.com | 17 May 2023

> why someone would start something in a memory unsafe language these days
You might like what we (Splitgraph) are building with Seafowl [0], a new database which is written in Rust and based on Datafusion and delta-rs [1]. It's optimized for running at the edge and responding to queries via HTTP with cache-friendly semantics.
[0] https://seafowl.io
[1] https://www.splitgraph.com/blog/seafowl-delta-storage-layer
We made a newsfeed for tracking new and deleted datasets across 200+ open data portals (and they're all queryable with SQL)
2 projects | /r/datasets | 13 Apr 2023

For example, here's the IPInfo dataset, and here's some commodities data from Trase which is proxying to their live Postgres database, and powering their interactive dashboard. Also, here's the repository of Socrata metadata powering the newsfeed - we scrape it nightly and then push it to Seafowl, our new open-source database optimized for running cache-friendly queries "at the edge." The code for Open Data Monitor is on GitHub, if you're curious.
Quicker Serverless Postgres Connections
1 project | news.ycombinator.com | 28 Mar 2023

This is basically how we do authentication in the Splitgraph DDN [0], which is kind of like a multi-tenant serverless Postgres.
We implement the Postgres frontend with a forked version of PgBouncer, and we changed the authentication method such that when the user authenticates, we issue them a JWT which we store as a session variable. That session variable has the same security properties as a cookie in a web browser (the user can change/manipulate it, but if it's signed by us we can trust its claims).
That's the simple explanation that skips over the multi-tenant part. I don't want to derail from the thread - Neon is very cool, and we are actually experimenting with it right now, for storing the Seafowl [1] catalog when deploying to "scale to zero" services like Google Cloud Run or AWS Lambda, which don't have persistent storage.
[0] https://www.splitgraph.com/connect/query
[1] https://seafowl.io
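
The point about signed claims (the user can change the token, but a valid signature means the claims can be trusted) can be sketched with an HS256-style token built from Python's standard library. The secret, the claim names, and the helper functions are assumptions made for illustration; this is not the Splitgraph or PgBouncer implementation, and it omits expiry and algorithm checks that a real verifier needs.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical signing key, for illustration only


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(header: str, payload: str) -> str:
    """HMAC-SHA256 over 'header.payload', base64url-encoded."""
    signing_input = f"{header}.{payload}".encode("utf-8")
    return b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())


def sign(claims: dict) -> str:
    """Issue a token whose claims are protected by the signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = b64url(json.dumps(claims).encode("utf-8"))
    return f"{header}.{payload}.{_signature(header, payload)}"


def verify(token: str):
    """Return the claims if the signature matches, otherwise None."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None  # not three dot-separated parts
    expected = _signature(header, payload)
    # Constant-time comparison on bytes.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return None
    return json.loads(b64url_decode(payload))


token = sign({"user": "alice"})

# A client that swaps in different claims cannot produce a matching signature.
head, _, sig = token.split(".")
forged_payload = b64url(json.dumps({"user": "admin"}).encode("utf-8"))
forged = f"{head}.{forged_payload}.{sig}"
```

Here `verify(token)` returns the original claims, while `verify(forged)` returns `None` because the signature no longer matches the altered payload.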
Show HN: Free IP to Country and ASN Downloads from Ipinfo.io
1 project | news.ycombinator.com | 1 Mar 2023

This is really cool! I've always found IP data to be a compelling example of a data product, especially when talking about Splitgraph, a company of which I'm a co-founder (and btw - I also met my co-founder on HN!).
So, I exported the CSV files for country and asn data, and then uploaded them to Splitgraph. You can see some sample queries in the readme of the repository [0]. Since Splitgraph is built on Postgres, it's possible to use all the `inet` and `cidr` tools available from Postgres, so you can make range queries easily. One sample query also demonstrates a join between the two tables, resulting in the equivalent of your combined country_asn.csv.
Another idea: We have a newer project called Seafowl [1], which is an open-source analytical database optimized for running "at the edge," with cache-friendly semantics making it ideal for querying from Web applications. We don't have a self-hosted version of this yet, but perhaps the next thing to try would be loading this data into Seafowl and querying it "at the edge" - I've been thinking about ways that we could package Seafowl as an OpenResty module, which could allow for true "at the edge" use cases like querying IP data in your reverse proxy. (The .mmdb format already solves this particular problem pretty efficiently and interoperably, although I'd be curious to measure the difference.)
[0] https://www.splitgraph.com/miles/ipinfo-country-asn
[1] https://seafowl.io/
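
The range-query idea from this post (finding which CIDR block contains an address) can be sketched outside Postgres with Python's standard ipaddress module. The rows below are made up and use the RFC 5737 documentation ranges with placeholder country codes; they are not IPinfo data, and the function is a hypothetical helper, not one of the sample queries from the linked repository.

```python
import ipaddress

# Made-up sample rows: documentation-only ranges and placeholder codes.
RANGES = [
    ("192.0.2.0/24", "AA"),
    ("198.51.100.0/24", "BB"),
    ("203.0.113.0/24", "CC"),
]


def lookup_country(ip: str):
    """Return the code of the first range containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for cidr, country in RANGES:
        # Containment check, similar in spirit to a Postgres inet/cidr match.
        if addr in ipaddress.ip_network(cidr):
            return country
    return None
```

A linear scan is fine for a handful of rows; a full IP dataset would need an index or a sorted structure, which is what the database provides.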
I Migrated from a Postgres Cluster to Distributed SQLite with LiteFS
4 projects | news.ycombinator.com | 5 Jan 2023

You can indeed run LiteFS by yourself, without Consul, as a sidecar / wrapper around your application. We do it in our project and have a Docker Compose example at [0]. In this case, you specify a specific known leader node. We haven't tried getting it running independently with Consul to do leader election / failover.
[0] https://github.com/splitgraph/seafowl/blob/main/examples/lit...
Ask HN: Serverless SQLite or Closest DX to Cloudflare D1?
2 projects | news.ycombinator.com | 2 Jan 2023

This is the vision of what we're building at Splitgraph. [0] You might be most interested in our recent project Seafowl [1] which is an open-source analytical database optimized for running "at the edge," with cache-friendly semantics making it ideal for querying from Web applications. It's built in Rust using DataFusion and incorporates many of the lessons we've learned building the Data Delivery Network [2] for Splitgraph.
[0] https://www.splitgraph.com
[1] https://seafowl.io
[2] https://www.splitgraph.com/connect
PostgREST – Serve a RESTful API from Any Postgres Database
22 projects | news.ycombinator.com | 29 Dec 2022

> why not just accept SQL and cut out all the unnecessary mapping?
You might be interested in what we're building: Seafowl, a database designed for running analytical SQL queries straight from the user's browser, with HTTP CDN-friendly caching [0]. It's a second iteration of the Splitgraph DDN [1] which we built on top of PostgreSQL (Seafowl is much faster for this use case, since it's based on Apache DataFusion + Parquet).
The tradeoff for allowing the client to run any SQL vs a limited API is that PostgREST-style queries have a fairly predictable and low overhead, but aren't as powerful as fully-fledged SQL with aggregations, joins, window functions and CTEs, which have their uses in interactive dashboards to reduce the amount of data that has to be processed on the client.
There's also ROAPI [2] which is a read-only SQL API that you can deploy in front of a database / other data source (though in case of using databases as a data source, it's only for tables that fit in memory).
[0] https://seafowl.io/
[1] https://www.splitgraph.com/connect
[2] https://github.com/roapi/roapi
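
One way to make arbitrary SQL friendly to HTTP caches, as this post describes, is to derive the request URL from a hash of the query text, so that identical queries map to the same cacheable URL. The sketch below only illustrates that idea: the `/q/<hash>` path layout, the base URL, and the function name are assumptions, and Seafowl's real endpoint and headers may differ.

```python
import hashlib


def query_url(base_url: str, sql: str) -> str:
    """Build a content-addressed URL for a SQL query.

    Only leading and trailing whitespace is stripped before hashing, so
    whitespace inside string literals still produces a different URL.
    """
    digest = hashlib.sha256(sql.strip().encode("utf-8")).hexdigest()
    return f"{base_url}/q/{digest}"


# Hypothetical base URL; the same query text always yields the same URL.
url_one = query_url("https://example.test", "SELECT 1")
url_two = query_url("https://example.test", "  SELECT 1\n")
url_other = query_url("https://example.test", "SELECT 2")
```

Because the URL depends only on the query text, a CDN or browser cache can serve repeated requests for the same query without reaching the database.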
Show HN: Socrata Roulette – run random SQL on a random government dataset
1 project | news.ycombinator.com | 9 Dec 2022

It's possible! Currently this is running GROUP BY queries using Socrata's query API on the original government data portal. We're adding the ability to import data from these sources into a columnar format in the future, either into Splitgraph itself or syncing the data out into Seafowl (https://seafowl.io/) which uses Parquet and is much faster.
Technically, the ability is already there (you can add a dataset to Splitgraph and select Socrata as a source if you know the dataset ID), but it's not as turnkey as landing on a dataset page and clicking a button. More to come!
Welcome to InfluxDB IOx: InfluxData’s New Storage Engine
5 projects | news.ycombinator.com | 26 Oct 2022

Just wanted to give a shout out to Apache DataFusion[0] that IOx relies on a lot (and contributes to as well!).
It's a framework for writing query engines in Rust that takes care of a lot of heavy lifting around parsing SQL, type casting, constructing and transforming query plans and optimizing them. It's pluggable, making it easy to write custom data sources, optimizer rules, query nodes etc.
It has very good single-node performance (there's even a way to compile it with SIMD support) and Ballista [1] extends that to build it into a distributed query engine.
Plenty of other projects use it besides IOx, including VegaFusion, ROAPI, Cube.js's preaggregation store. We're heavily using it to build Seafowl [2], an analytical database that's optimized for running SQL queries directly from the user's browser (caching, CDNs, low latency, some WASM support, all that fun stuff).
[0] https://github.com/apache/arrow-datafusion
[1] https://github.com/apache/arrow-ballista
[2] https://github.com/splitgraph/seafowl

What are some alternatives?

When comparing marmot and seafowl you can also consider the following projects:

pocketbase - Open Source realtime backend in 1 file

datafusion-ballista - Apache Arrow Ballista Distributed Query Engine

cr-sqlite - Convergent, Replicated SQLite. Multi-writer and CRDT support for SQLite

azurefs - Mount Microsoft Azure Blob Storage as local filesystem in Linux (inactive)

litefs - FUSE-based file system for replicating SQLite databases across a cluster of machines

annuaire-entreprises-sirene-api

wordpress-playground - Run WordPress in the browser via WebAssembly PHP

mindcastle.io - Massively scalable, cloud-backed distributed block device for Linux and VMs

mssql-changefeed

Prisma - Next-generation ORM for Node.js & TypeScript | PostgreSQL, MySQL, MariaDB, SQL Server, SQLite, MongoDB and CockroachDB

rqlite - The lightweight, distributed relational database built on SQLite.

Directus - The Modern Data Stack 🐰 — Directus is an instant REST+GraphQL API and intuitive no-code data collaboration app for any SQL database.
