| | multiversion-concurrency-contro | cr-sqlite |
|---|---|---|
| Mentions | 16 | 28 |
| Stars | - | 2,434 |
| Growth | - | 3.2% |
| Activity | - | 9.6 |
| Latest commit | - | 8 days ago |
| Language | | Rust |
| License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
multiversion-concurrency-contro
-
CRDT-richtext: Rust implementation of Peritext and Fugue
https://github.com/samsquire/multiversion-concurrency-contro...
I also implemented a 3-way text diff using the Myers algorithm, based on https://blog.jcoglan.com/2017/02/12/the-myers-diff-algorithm...
https://github.com/samsquire/text-diff
I implemented an eventually consistent mesh protocol that uses timestamps to provide last-write-wins semantics.
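To make the last-write-wins idea concrete, here is a minimal Python sketch. It is not taken from the linked repository: the `Entry` type, the `merge` function, and the use of a node id as a tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One replicated value (hypothetical type, not from the repository)."""
    value: str
    timestamp: float
    node_id: str  # assumed tie-breaker when two timestamps are equal


def merge(local: dict, remote: dict) -> dict:
    """Return the last-write-wins merge of two replica states."""
    merged = dict(local)
    for key, theirs in remote.items():
        ours = merged.get(key)
        # The entry with the larger (timestamp, node_id) pair wins.
        if ours is None or (theirs.timestamp, theirs.node_id) > (ours.timestamp, ours.node_id):
            merged[key] = theirs
    return merged


replica_a = {"x": Entry("from-a", 1.0, "a"), "y": Entry("old", 1.0, "a")}
replica_b = {"y": Entry("new", 2.0, "b"), "z": Entry("from-b", 1.5, "b")}

merged_ab = merge(replica_a, replica_b)
merged_ba = merge(replica_b, replica_a)
```

Because the winner depends only on the `(timestamp, node_id)` pair, merging in either order gives the same state, which is what lets replicas converge without coordination. The cost is that a concurrent write with an older timestamp is silently discarded.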
-
A collection of lock-free data structures written in standard C++11
I lean towards per-thread sharding rather than mutex-based or lock-free data structures, with the exception of lock-free ring buffers.
You can get embarrassingly parallel performance if you split your data by thread and aggregate periodically.
If you need a consistent view of your entire data set, that is the slow path with sharding.
In my experiments with multithreaded software, I simulate a bank where many bank accounts are randomly withdrawn from and deposited to. https://github.com/samsquire/multiversion-concurrency-contro...
I get 700 million requests per second due to the sharding of money over accounts.
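The sharding approach described in this comment can be sketched roughly as follows. This is an illustrative Python version, not the author's Java benchmark: the constants, the `worker` function, and the choice to model operations as transfers inside a shard are assumptions. Python's global interpreter lock also means it will not show the parallel speedup, only the structure.

```python
import random
import threading

NUM_THREADS = 4
ACCOUNTS_PER_SHARD = 8
OPS_PER_THREAD = 10_000
STARTING_BALANCE = 1_000

# One private list of balances per thread: no locks, because no sharing.
shards = [[STARTING_BALANCE] * ACCOUNTS_PER_SHARD for _ in range(NUM_THREADS)]


def worker(shard, seed):
    """Move money between random accounts inside this thread's own shard."""
    rng = random.Random(seed)
    for _ in range(OPS_PER_THREAD):
        src = rng.randrange(len(shard))
        dst = rng.randrange(len(shard))
        amount = rng.randint(1, 50)
        if shard[src] >= amount:  # refuse overdrafts
            shard[src] -= amount
            shard[dst] += amount


def run():
    threads = [
        threading.Thread(target=worker, args=(shards[i], i))
        for i in range(NUM_THREADS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Slow path: a consistent global view needs every shard to be quiescent.
    return sum(sum(shard) for shard in shards)


total = run()
```

The fast path never touches another thread's data. The global total is only read after all workers have finished, which mirrors the "aggregate periodically" step and shows why a consistent view is the expensive operation.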
-
The “Build Your Own Database” book is finished
If you want some sample code that implements MVCC, I wrote a toy example in multithreaded Java:
https://github.com/samsquire/multiversion-concurrency-contro...
Read TransactionC.java first, then MVCC.java.
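For readers who want the general shape of MVCC before opening the Java code, here is a small Python sketch. It is not a port of TransactionC.java or MVCC.java. The `Store`, `Transaction`, and `Conflict` names are hypothetical, and the first-committer-wins conflict rule is an assumption chosen for brevity.

```python
import threading


class Conflict(Exception):
    """Raised when another transaction committed a newer version first."""


class Store:
    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # timestamp of the latest commit
        self._lock = threading.Lock()

    def begin(self):
        with self._lock:
            return Transaction(self, self._clock)

    def read(self, key, snapshot_ts):
        # Return the newest version visible at the snapshot timestamp.
        with self._lock:
            for commit_ts, value in reversed(self._versions.get(key, [])):
                if commit_ts <= snapshot_ts:
                    return value
        return None

    def commit(self, txn):
        with self._lock:
            # Validate: abort if any written key changed after our snapshot.
            for key in txn.writes:
                versions = self._versions.get(key, [])
                if versions and versions[-1][0] > txn.snapshot_ts:
                    raise Conflict(key)
            self._clock += 1
            for key, value in txn.writes.items():
                self._versions.setdefault(key, []).append((self._clock, value))


class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}  # buffered until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.snapshot_ts)

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        self.store.commit(self)


store = Store()
setup = store.begin()
setup.put("balance", 100)
setup.commit()

t1 = store.begin()
t2 = store.begin()
t1.put("balance", 150)
t1.commit()

seen_by_t2 = t2.get("balance")  # t2 still reads its own snapshot
t2.put("balance", 50)
try:
    t2.commit()
    t2_aborted = False
except Conflict:
    t2_aborted = True
```

Readers never block writers here: `t2` keeps seeing the version that was current when it began, and the conflict only surfaces when it tries to commit over `t1`'s newer version.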
-
Let's write a setjmp
I wrote an unrolled switch statement in Java to simulate eager async/await across threads.
https://github.com/samsquire/multiversion-concurrency-contro...
The goal is that a compiler should generate this for you. This code is equivalent to the following:
task1:
-
Structured Concurrency Definition
https://doc.rust-lang.org/book/ch16-00-concurrency.html
I've been working on implementing a Java async/await state machine with switch statements and a scheduling loop. If the user doesn't await the async task handle, then the task's return value is never handled. This is similar to the Go problem with the go statement.
https://github.com/samsquire/multiversion-concurrency-contro...
If your async call returns a handle and
-
Small VMs and Coroutines
yield value2++
https://github.com/samsquire/multiversion-concurrency-contro...
I am still working on allowing multiple coroutines to be in flight in parallel. At the moment the tasks share the same background thread.
I asked this Stack Overflow question about C++ coroutines, as I wanted to use coroutines with a thread pool.
https://stackoverflow.com/questions/74520133/how-can-i-pass-...
-
Hctree is an experimental high-concurrency database back end for SQLite
This is very interesting. Thank you for submitting this and thank you for working on this.
I am highly interested in parallelism and high concurrency. I implemented multiversion concurrency control in Java.
https://github.com/samsquire/multiversion-concurrency-contro...
I am curious how to handle replication with high concurrency. I'm not sure how you detect dangerous reads and writes to the same key (tuples/fields) across different replica machines, in other words with multi-master replication.
I am aware that Google uses TrueTime and some form of timestamp ordering and detection of interfering timestamps, but I'm not sure how to replicate that.
I began working on an algorithm to synchronize database records: sort the rows, then compute a chained hash for each row, where hash(row) = hash(previous_row.hash + row.data).
Then do a binary search on whether the hashes match. This is a synchronization algorithm I'm designing that requires minimal data transfer but multiple round trips.
The binary search would first compare the ends of the data sets, hash(replica_a.row[last]) == hash(replica_b.row[last]), then split the hash list in half and check the middle item. This tells you which row and which columns are different.
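A rough Python sketch of the chained-hash comparison described above follows. The function names, the use of SHA-256, and the representation of rows as strings are assumptions. In a real deployment each comparison inside the loop would be one network round trip, whereas here both hash lists are local.

```python
import hashlib


def chain_hashes(rows):
    """Sort rows, then hash each one together with the previous row's hash."""
    hashes = []
    previous = ""
    for row in sorted(rows):
        previous = hashlib.sha256((previous + row).encode("utf-8")).hexdigest()
        hashes.append(previous)
    return hashes


def first_divergence(hashes_a, hashes_b):
    """Return the index of the first differing row, or None if identical.

    Because each hash covers every earlier row, a match at index i implies
    all rows up to i match, and a mismatch at i implies every later hash
    also mismatches. That monotonic property is what makes binary search valid.
    """
    n = min(len(hashes_a), len(hashes_b))
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if hashes_a[mid] == hashes_b[mid]:
            lo = mid + 1  # everything up to mid agrees
        else:
            hi = mid      # divergence is at mid or earlier
    if lo == n and len(hashes_a) == len(hashes_b):
        return None
    return lo


replica_a = ["alice,10", "bob,20", "carol,30", "dave,40"]
replica_b = ["alice,10", "bob,20", "carol,31", "dave,40"]

diverged_at = first_divergence(chain_hashes(replica_a), chain_hashes(replica_b))
in_sync = first_divergence(chain_hashes(replica_a), chain_hashes(replica_a))
```

The search needs about log2(n) comparisons, so only a handful of hashes cross the network before the first differing row is located. This sketch finds only the first divergence; after repairing that row, the chain would have to be recomputed and the search repeated to find later differences.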
-
Tail Call Optimization: The Musical
https://github.com/samsquire/multiversion-concurrency-contro...
I want to redesign the architecture of the async/await to be easier to understand. I want to use a state machine somehow.
-
Rust Atomics and Locks: Low-Level Concurrency in Practice
I wrote an unrolled state machine for my async/await in Java. This models a simple async/await program and runs tasks on other threads - without locks. I use a design I call token ring parallelism, where threads take turns and are linked together in a ring structure.
https://github.com/samsquire/multiversion-concurrency-contro...
I wrote my own lock-free algorithm, which I use for message passing between actor threads. My goal is high throughput and low latency.
https://github.com/samsquire/multiversion-concurrency-contro...
With 11 threads (on a 12 core processor, deliberately left one core for Windows)
-
A Compiler Writing Playground
I then started writing a parser for a high level language and then code generation from the AST to the imaginary assembly. My interpreter is multithreaded and can send integers between interpreters. It is very early and doesn't do much.
Code is at https://github.com/samsquire/multiversion-concurrency-contro...
The high-level language looks similar to JavaScript, except that I tried to parse everything as an expression. I still need to parse functions as expressions.
I was experimenting with Protothreads in C recently to try to understand how they work, and I wrote a giant switch statement and a while loop in Java to simulate async/await. It would be interesting to do codegen for coroutines.
Here's that giant switch statement and scheduler: https://github.com/samsquire/multiversion-concurrency-contro...
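The "switch statement plus scheduling loop" pattern can be illustrated with a short Python sketch. This is not the linked Java code: the `Task` class, its three states, and the round-robin `run` loop are hypothetical, and an if/elif chain stands in for the Java switch.

```python
from collections import deque


class Task:
    """A coroutine unrolled by hand: `state` records where to resume."""

    def __init__(self, name, limit):
        self.name = name
        self.limit = limit
        self.state = 0
        self.counter = 0
        self.done = False
        self.result = None

    def step(self, log):
        # Each branch is the code between two suspension points.
        if self.state == 0:
            log.append(f"{self.name}: start")
            self.state = 1
        elif self.state == 1:
            self.counter += 1
            log.append(f"{self.name}: tick {self.counter}")
            if self.counter == self.limit:
                self.state = 2
        elif self.state == 2:
            self.result = self.counter
            self.done = True


def run(tasks):
    """Round-robin scheduler: run one step per task until all are done."""
    log = []
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        task.step(log)
        if not task.done:
            ready.append(task)  # suspended, resume later
    return log


task_a = Task("a", limit=2)
task_b = Task("b", limit=1)
log = run([task_a, task_b])
```

Returning from `step` plays the role of a yield or await, and the saved `state` field plays the role of the program counter. A compiler that generated this form from ordinary-looking async code would be doing the codegen the comment describes.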
One idea I had for a stackless design was to preallocate the memory that each function needs for a call and avoid a stack altogether. This would allow coroutines between methods and avoid the function colour problem, because everything is a coroutine.
Are there any communities for programming language developers? Where do all the language developers meet up and talk theory and implementation? I am on eatonphil's Discord and we talk there.
One problem I am trying to solve is how to write a multithreaded interpreter and language that allows parallel interpretation, similar to C# and Java. If the allocator is thread-safe, you share an object pool between interpreters, and you hash object equality by source code, then you could send objects between threads with only a synchronization cost.
I believe Python has the problem that object identity is different in each subinterpreter, so you need to marshal the data.
cr-sqlite
-
Show HN: RemoteStorage – sync localStorage across devices and browsers
I'm a happy user of https://github.com/vlcn-io/cr-sqlite/
-
Marmot: Multi-writer distributed SQLite based on NATS
If you're interested in this, here are some related projects that all take slightly different approaches:
- LiteSync directly competes with Marmot and supports DDL sync, but is closed source commercial (similar to SQLite EE): https://litesync.io
- dqlite is Canonical's distributed SQLite that depends on c-raft and kernel-level async I/O: https://dqlite.io
- cr-sqlite is a Rust-based loadable extension that adds CRDT changeset generation and reconciliation to SQLite: https://github.com/vlcn-io/cr-sqlite
Slightly related but not really (no multi writer, no C-level SQLite API or other restrictions):
- comdb2 (Bloomberg's multi-homed RDBMS using SQLite as the frontend)
- rqlite: RDBMS with HTTP API and SQLite as the storage engine, used for replication and strong consistency (does not scale writes)
- litestream/LiteFS: disaster recovery replication
- liteserver: active read-only replication (predecessor of LiteSync)
-
Offline eventually consistent synchronization using CRDTS
Theory is great, but how can we apply this in practice? Instead of starting from 0, and writing a CRDT, let's try and leverage an existing project to do the heavy lifting. My choice is crSQLITE, an extension for SQLite to support CRDT merging of databases. Under the hood, the extension creates tables to track changes and allow inserting into an event log for merging states of separated peers.
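To show the general idea of tracking changes in side tables and merging them between peers, here is a small Python sketch using only the standard `sqlite3` module. It does not use cr-sqlite and does not reproduce its schema or API: the `todo` and `todo_changes` tables, the per-cell version counter, and the site-id tie-breaker are all assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE todo (id TEXT PRIMARY KEY, title TEXT);
-- Hypothetical change log: the latest known write for each cell.
CREATE TABLE todo_changes (
    id TEXT, col TEXT, val TEXT, version INTEGER, site TEXT,
    PRIMARY KEY (id, col)
);
"""


def open_peer():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def apply_change(db, change):
    """Apply a change only if it is newer than what this peer already has."""
    row_id, col, val, version, site = change
    known = db.execute(
        "SELECT version, site FROM todo_changes WHERE id = ? AND col = ?",
        (row_id, col),
    ).fetchone()
    if known is not None and (known[0], known[1]) >= (version, site):
        return False  # we already have this write or a winning one
    db.execute("INSERT OR REPLACE INTO todo_changes VALUES (?, ?, ?, ?, ?)", change)
    db.execute("INSERT OR REPLACE INTO todo (id, title) VALUES (?, ?)", (row_id, val))
    return True


def local_write(db, site, row_id, title):
    """Record a local edit with a version one higher than the last known."""
    known = db.execute(
        "SELECT version FROM todo_changes WHERE id = ? AND col = 'title'",
        (row_id,),
    ).fetchone()
    version = (known[0] if known else 0) + 1
    apply_change(db, (row_id, "title", title, version, site))


def sync(src, dst):
    """Send every logged change from src to dst."""
    changes = src.execute(
        "SELECT id, col, val, version, site FROM todo_changes"
    ).fetchall()
    for change in changes:
        apply_change(dst, change)


peer_a, peer_b = open_peer(), open_peer()
local_write(peer_a, "A", "1", "buy milk")
sync(peer_a, peer_b)

# Both peers edit the same row while disconnected.
local_write(peer_a, "A", "1", "buy oat milk")
local_write(peer_b, "B", "1", "buy soy milk")

sync(peer_a, peer_b)
sync(peer_b, peer_a)

title_a = peer_a.execute("SELECT title FROM todo WHERE id = '1'").fetchone()[0]
title_b = peer_b.execute("SELECT title FROM todo WHERE id = '1'").fetchone()[0]
```

Both peers end up with the same title regardless of sync order, because the winner is decided by comparing `(version, site)` rather than by arrival time. A real CRDT extension has to handle far more, such as deletes, multiple columns, and schema changes.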
-
Local-first software: You own your data, in spite of the cloud (2019)
Also https://github.com/vlcn-io/cr-sqlite/ which is SQLite + CRDTs
Runs/syncs to the browser too which is just lovely.
-
I'm All-In on Server-Side SQLite
If you need multiple writers and can handle eventual consistency, you should really be using cr-sqlite[1]. It allows you to have any number of workers/clients that write locally within the same process (so no network overhead) while still guaranteeing convergence to the same state.
[1] https://github.com/vlcn-io/cr-sqlite
-
Show HN: ElectricSQL, Postgres to SQLite active-active sync for local-first apps
I am fully on the offline-first bandwagon after starting to use cr-sqlite (https://vlcn.io), which works similarly to ElectricSQL.
I thought the bundle size of wasm-sqlite would be prohibitive, but it's surprisingly quick to download and boot. Reducing network reliance solves so many problems and corner-cases in my web app. Having access to local data makes everything very snappy too - the user experience is much better. Even if the user's offline data is wiped by the browser (offline storage limits are a bit of a minefield), it is straightforward to get all synced changes back from the server.
-
Launch HN: Tiptap (YC S23) – Toolkit for developing collaborative editors
I didn't know that. Especially the first approach sounds interesting to me, because as far as I know the transactions of Yjs seem to be a problem on heavily changing documents. https://github.com/vlcn-io/cr-sqlite#approach-1-history-free... Thanks!
- Scaling Linear's Sync Engine
-
Mycelite: SQLite extension to synchronize changes across SQLite instances
I wonder how this compares to https://vlcn.io?
-
Ask HN: Incremental View Maintenance for SQLite?
The short ask: Anyone know of any projects that bring incremental view maintenance to SQLite?
The why:
Applications are usually read heavy. It is a sad state of affairs that, for these kinds of apps, we don't put more work on the write path to allow reads to benefit.
Would the whole No-SQL movement ever even have been a thing if relational databases had great support for materialized views that updated incrementally? I'd like to think not.
And more context:
I'm working to push the state of "functional relational programming" [1], [2] further forward. Materialized views with incremental updates are key to this. Bringing them to SQLite so they can be leveraged on the frontend would solve this whole quagmire of "state management libraries." I've been solving the data-sync problem in SQLite (https://vlcn.io/) and this piece is one of the next logical steps.
If nobody knows of an existing solution, would love to collaborate with someone on creating it.
[1] - https://github.com/papers-we-love/papers-we-love/blob/main/design/out-of-the-tar-pit.pdf
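As a hand-rolled illustration of what the question above is asking for, the sketch below maintains an aggregate table incrementally with ordinary SQLite triggers, driven from Python's standard `sqlite3` module. It is not an existing incremental view maintenance project. The `orders` and `customer_totals` tables are hypothetical, and only inserts and deletes are handled.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    amount INTEGER NOT NULL
);

-- Plays the role of a materialized view of SUM(amount) per customer.
CREATE TABLE customer_totals (
    customer TEXT PRIMARY KEY,
    total INTEGER NOT NULL
);

-- Extra work on the write path: adjust the total by the delta only.
CREATE TRIGGER orders_after_insert AFTER INSERT ON orders BEGIN
    INSERT OR IGNORE INTO customer_totals (customer, total)
        VALUES (NEW.customer, 0);
    UPDATE customer_totals SET total = total + NEW.amount
        WHERE customer = NEW.customer;
END;

CREATE TRIGGER orders_after_delete AFTER DELETE ON orders BEGIN
    UPDATE customer_totals SET total = total - OLD.amount
        WHERE customer = OLD.customer;
END;
""")

db.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ada", 10), ("ada", 5), ("bob", 7)],
)
db.execute("DELETE FROM orders WHERE customer = 'ada' AND amount = 5")

# Cheap read path: one primary-key lookup per customer.
view = dict(db.execute("SELECT customer, total FROM customer_totals"))
# Full recomputation, shown only to check the maintained view.
recomputed = dict(db.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer"))
```

Writing triggers like this by hand for every view is exactly the tedium a general incremental view maintenance layer would remove. The sketch also leaves gaps: there is no UPDATE trigger, and a customer whose orders are all deleted keeps a row with a total of 0.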
What are some alternatives?
electric - Local-first sync layer for web and mobile apps. Build reactive, realtime, local-first apps directly on Postgres.
swift - the multiparty transport protocol (aka "TCP with swarming" or "BitTorrent at the transport layer")
marmot - A distributed SQLite replicator built on top of NATS
supercollider - An audio server, programming language, and IDE for sound synthesis and algorithmic composition.
vlcn-orm - Develop with your data model anywhere. Query and load data reactively. Replicate between peers without a central server.
dictomaton - Finite state dictionaries in Java
edgedb-go - The official Go client library for EdgeDB
hamt - A hash array-mapped trie implementation in C
imdbench - IMDBench — Realistic ORM benchmarking
multiversion-concurrency-control - Implementation of multiversion concurrency control, Raft, Left Right concurrency Hashmaps and a multi consumer multi producer Ringbuffer, concurrent and parallel load-balanced loops, parallel actors implementation in Main.java, Actor2.java and a parallel interpreter
edgedb-cli - The EdgeDB CLI