roapi
litestream
| | roapi | litestream |
|---|---|---|
| Mentions | 24 | 165 |
| Stars | 3,080 | 9,997 |
| Growth | 0.8% | - |
| Activity | 6.9 | 7.5 |
| Last commit | about 1 month ago | 11 days ago |
| Language | Rust | Go |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
roapi
- Full-fledged APIs for slowly moving datasets without writing code
-
Tuql: Automatically create a GraphQL server from a SQLite database
If your use case is read-only I suggest taking a look at roapi[1]. It supports multiple read frontends (GraphQL, SQL, REST) and many backends like SQLite, JSON, google sheets, MySQL, etc.
[1] https://github.com/roapi/roapi
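To make the "multiple read frontends" point concrete, here is a minimal sketch of how a client might build requests against the SQL, REST, and GraphQL frontends of a running roapi instance. The host, port, table name, and column names are hypothetical, and the endpoint paths (`/api/sql`, `/api/tables/<table>`, `/api/graphql`) follow the roapi README as I understand it, so verify them against the version you deploy.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a roapi instance is listening here; adjust to your deployment.
BASE_URL = "http://localhost:8080"


def sql_request(query: str) -> urllib.request.Request:
    """Build a POST for the SQL frontend; the raw query text is the body."""
    return urllib.request.Request(
        f"{BASE_URL}/api/sql", data=query.encode("utf-8"), method="POST"
    )


def rest_request(table: str, columns: list[str], limit: int) -> urllib.request.Request:
    """Build a GET for the REST frontend using query-string operators."""
    params = urllib.parse.urlencode({"columns": ",".join(columns), "limit": limit})
    return urllib.request.Request(
        f"{BASE_URL}/api/tables/{table}?{params}", method="GET"
    )


def graphql_request(query: str) -> urllib.request.Request:
    """Build a POST for the GraphQL frontend; the raw query text is the body."""
    return urllib.request.Request(
        f"{BASE_URL}/api/graphql", data=query.encode("utf-8"), method="POST"
    )


def fetch_json(request: urllib.request.Request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running and a table registered under the hypothetical name `uk_cities`, `fetch_json(sql_request("SELECT city FROM uk_cities LIMIT 2"))` would return the decoded rows. Nothing is sent until `fetch_json` is called.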
- Who is using AXUM in production?
-
Ask HN: Best way to provide access to large data sets
For smaller datasets, anywhere up to a few MB, an API is reasonable, but in theory historic data could be up to several GB. I've not seen Datasette go that high (IIRC it has a 1000-row return limit by default).
That's what got me intrigued with Atlassian's offering, as data lakes tend to be something internal to a company, not something I've ever seen offered as an interaction point to users.
I've also tested out roapi [1] which is nice if the data is in some structured format already (Parquet/JSON)
[1] https://github.com/roapi/roapi
-
"thread 'main' panicked at 'no CA certificates found'", when running application in docker container
https://github.com/roapi/roapi/issues/103?
- Roapi 0.9 release adds support for all cloud storage providers
-
SQLite-based databases on the Postgres protocol? Yes we can
Very cool and well executed project. Love the sprinkle of Rust in all the other companion projects as well :)
The ROAPI (https://github.com/roapi/roapi) project I built happens to support a similar feature set, i.e. exposing SQLite through a variety of remote query interfaces, including the Postgres wire protocol, REST APIs, and GraphQL.
- Using Rust to write a Data Pipeline. Thoughts. Musings.
-
PostgREST – Serve a RESTful API from Any Postgres Database
> why not just accept SQL and cut out all the unnecessary mapping?
You might be interested in what we're building: Seafowl, a database designed for running analytical SQL queries straight from the user's browser, with HTTP CDN-friendly caching [0]. It's a second iteration of the Splitgraph DDN [1] which we built on top of PostgreSQL (Seafowl is much faster for this use case, since it's based on Apache DataFusion + Parquet).
The tradeoff for allowing the client to run any SQL vs a limited API is that PostgREST-style queries have a fairly predictable and low overhead, but aren't as powerful as fully-fledged SQL with aggregations, joins, window functions and CTEs, which have their uses in interactive dashboards to reduce the amount of data that has to be processed on the client.
There's also ROAPI [2] which is a read-only SQL API that you can deploy in front of a database / other data source (though in case of using databases as a data source, it's only for tables that fit in memory).
[0] https://seafowl.io/
[1] https://www.splitgraph.com/connect
[2] https://github.com/roapi/roapi
-
Command-line data analytics made easy
It could be the NDJSON parser (DF source: [0]) or could be a variety of other factors. Looking at the ROAPI release archive [1], it doesn't ship with the definitive `columnq` binary from your comment, so it could also have something to do with compilation-time flags.
FWIW, we use the Parquet format with DataFusion and get very good speeds similar to DuckDB [2], e.g. 1.5s to run a more complex aggregation query `SELECT date_trunc('month', tpep_pickup_datetime) AS month, COUNT(*) AS total_trips, SUM(total_amount) FROM tripdata GROUP BY 1 ORDER BY 1 ASC` on a 55M row subset of NY Taxi trip data.
[0]: https://github.com/apache/arrow-datafusion/blob/master/dataf...
[1]: https://github.com/roapi/roapi/releases/tag/roapi-v0.8.0
[2]: https://observablehq.com/@seafowl/benchmarks
litestream
-
Ask HN: SQLite in Production?
I have not, but I keep meaning to collate everything I've learned into a set of useful defaults just to remind myself what settings I should be enabling and why.
Regarding Litestream, I learned pretty much all I know from their documentation: https://litestream.io/
-
How (and why) to run SQLite in production
This presentation is focused on the use-case of vertically scaling a single server and driving everything through that app server, which is running SQLite embedded within your application process.
This is the sweet-spot for SQLite applications, but there have been explorations and advances to running SQLite across a network of app servers. LiteFS (https://fly.io/docs/litefs/), the sibling to Litestream for backups (https://litestream.io), is aimed at precisely this use-case. Similarly, Turso (https://turso.tech) is a new-ish managed database company for running SQLite in a more traditional client-server distribution.
-
SQLite3 Replication: A Wizard's Guide🧙🏽
This post intends to help you set up replication for SQLite using Litestream.
-
Ask HN: "Time travel" into a SQLite database using the WAL files?
I've been messing around with litestream. It is so cool. And, I either found a bug in the -timestamp switch or don't understand it correctly.
What I want to do is time travel into my sqlite database. I'm trying to do some forensics on why my web service returned the wrong data during a production event. Unfortunately, after the event, someone deleted records from the database and I'm unsure what the data looked like and am having trouble recreating the production issue.
Litestream has this great switch: -timestamp. If you use it (AFAICT) you can time travel into your database and go back to the database state at that moment. However, it does not seem to work as I expect it to:
https://github.com/benbjohnson/litestream/issues/564
I have the entirety of the sqlite database from the production event as well. Is there a way I could cycle through the WAL files and restore the database to the point in time before the records I need were deleted?
Will someone take sqlite and compile it into the browser using WASM so I can drag a sqlite database and WAL files into it and then using a timeline slider see all the states of the database over time? :)
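The point-in-time restore described in the comment above can be sketched as a small wrapper around `litestream restore`. This assumes the `litestream` binary is on `PATH`, that the database was already being replicated before the incident, and that `-timestamp` takes an RFC 3339 value; the paths are hypothetical. It restores from a Litestream replica and does not replay a loose set of WAL files, so it only covers part of the question.

```python
import subprocess
from datetime import datetime, timezone


def build_restore_cmd(source: str, output_path: str, when: datetime) -> list[str]:
    """Build the argv for restoring `source` as it looked at `when`.

    `source` is a local database path (resolved through the Litestream
    config) or a replica URL. `when` must be timezone-aware.
    """
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    # Normalize to UTC and format as RFC 3339 with a trailing Z.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return ["litestream", "restore", "-o", output_path, "-timestamp", stamp, source]


def restore(source: str, output_path: str, when: datetime) -> None:
    """Run the restore; raises CalledProcessError if litestream exits non-zero."""
    subprocess.run(build_restore_cmd(source, output_path, when), check=True)
```

Restoring to a fresh output path each time, rather than over the live database, makes it possible to step through several timestamps and compare the resulting files.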
-
Ask HN: Are you using SQLite and Litestream in production?
We're using SQLite in production very heavily with millions of databases and fairly high operations throughput.
But we did run into some scariness around trying to use Litestream that put me off it for the time being. Litestream is really cool but it is also very much a cool hack and the risk of database corruption issues feels very real.
The scariness I ran into was related to this issue https://github.com/benbjohnson/litestream/issues/510
-
Pocketbase: Open-source back end in 1 file
Litestream is a library that allows you to easily create backups. You can probably just do analytic queries on the backup data and reduce load on your server.
https://litestream.io/
- Litestream – Disaster recovery and continuous replication for SQLite
- Litestream: Replicated SQLite with no main and little cost
-
Why you should probably be using SQLite
One possible strategy is to have one directory/file per customer which is one SQLite file. But then as the user logs in, you have to look up first what database they should be connected to.
OR somehow derive it from the user ID/username. Keeping all the customer databases in a single directory/disk and then constantly "lite streaming" to S3.
Because each user is isolated, they'll be writing to their own database. But migrations would be a pain. They will have to be rolled out to each database separately.
One upside is, you can give users the ability to take their data with them, any time. It is just a single file.
[0]. https://litestream.io/
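The one-database-per-customer strategy in the comment above can be sketched as follows. This is a minimal illustration, not a recommended layout: the hashing scheme, the `notes` table, and the migration list are all hypothetical, and the schema version is tracked with SQLite's `PRAGMA user_version`. Replicating the directory with Litestream is left out.

```python
import hashlib
import sqlite3
from pathlib import Path

# Hypothetical migrations; position in the list defines the schema version.
MIGRATIONS = [
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)",
    "ALTER TABLE notes ADD COLUMN created_at TEXT",
]


def db_path_for(root: Path, user_id: str) -> Path:
    """Derive the database file from the user ID, so no lookup table is needed."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:16]
    return root / f"{digest}.db"


def migrate(path: Path) -> int:
    """Apply pending migrations to one database and return its schema version."""
    conn = sqlite3.connect(str(path))
    try:
        version = conn.execute("PRAGMA user_version").fetchone()[0]
        for index in range(version, len(MIGRATIONS)):
            conn.execute(MIGRATIONS[index])
            # Record progress after each step so a rerun resumes where it stopped.
            conn.execute(f"PRAGMA user_version = {index + 1}")
            conn.commit()
        return conn.execute("PRAGMA user_version").fetchone()[0]
    finally:
        conn.close()


def migrate_all(root: Path) -> dict[str, int]:
    """Roll migrations out to every customer database, one file at a time."""
    return {path.name: migrate(path) for path in sorted(root.glob("*.db"))}
```

The per-file loop in `migrate_all` is the "migrations would be a pain" part: each database is upgraded separately, so a failure partway through leaves customers on different schema versions until the rollout is rerun.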
-
Monitor your Websites and Apps using Uptime Kuma
Uptime Kuma uses a local SQLite database to store account data, configuration for services to monitor, notification settings, and more. To make sure that our data is available across redeploys, we will bundle Uptime Kuma with Litestream, a project that implements streaming replication for SQLite databases to a remote object storage provider. Effectively, this allows us to treat the local SQLite database as if it were securely stored in a remote database.
What are some alternatives?
php-parquet - PHP implementation for reading and writing Apache Parquet files/streams. NOTICE: Please migrate to https://github.com/codename-hub/php-parquet.
rqlite - The lightweight, distributed relational database built on SQLite.
qframe - Immutable data frame for Go
pocketbase - Open Source realtime backend in 1 file
materialize - The data warehouse for operational workloads.
realtime - Broadcast, Presence, and Postgres Changes via WebSockets
delta-rs - A native Rust library for Delta Lake, with bindings into Python
k8s-mediaserver-operator - Repository for k8s Mediaserver Operator project
fluvio - Lean and mean distributed stream processing system written in rust and web assembly.
sqlcipher - SQLCipher is a standalone fork of SQLite that adds 256 bit AES encryption of database files and other security features.
datasette - An open source multi-tool for exploring and publishing data
litefs - FUSE-based file system for replicating SQLite databases across a cluster of machines