flow
differential-datalog
Metric | flow | differential-datalog
---|---|---
Mentions | 2 | 22
Stars | 1,479 | 1,338
Growth | 0.5% | 0.1%
Activity | 3.4 | 0.0
Last commit | 10 months ago | 10 months ago
Language | Elixir | Java
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
flow
-
Switching to Elixir
You can actually have "background jobs" in very different ways in Elixir.
> I want background work to live on different compute capacity than http requests, both because they have very different resources usage
In Elixir, because of the way the BEAM works (the unit of parallelism is much cheaper and consumes little memory), "incoming http requests" and related "workers" are a lot less expensive than in other stacks (for instance Ruby and Python), where it is quite critical to release "http workers" and not hold the connection (which is what led to the creation of background job tools like Resque, DelayedJob, Sidekiq, Celery...).
This means that you can actually hold incoming HTTP connections a lot longer without trouble.
A consequence of this is that implementing "reverse proxies", or anything calling third party servers _right in the middle_ of your own HTTP call, is usually perfectly acceptable (something I've done more than a couple of times, the latest one powering the reverse proxy behind https://transport.data.gouv.fr - code available at https://github.com/etalab/transport-site/tree/master/apps/un...).
As a consequence, what would be a bad pattern in Python or Ruby (holding the incoming HTTP connection) is not a problem with Elixir.
> because I want to have state or queues in front of background work so there's a well-defined process for retry, error handling, and back-pressure.
Beyond immediate work like reverse proxying or cheap "one off async tasks" (like recording a metric), there are also solutions for more "stateful" background work in Elixir.
A popular background job queue is https://github.com/sorentwo/oban (roughly similar to Sidekiq et al.), which uses Postgres.
It handles retries, errors etc.
But it's not the only solution, as you have other tools dedicated to processing, such as Broadway (https://github.com/dashbitco/broadway), which handles back-pressure, fault-tolerance, batching etc natively.
You also have simpler options, such as flow (https://github.com/dashbitco/flow), gen_stage (https://github.com/elixir-lang/gen_stage), Task.async_stream (https://hexdocs.pm/elixir/1.12/Task.html#async_stream/5) etc.
This makes it quite easy to use the "right tool for the job".
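To make the back-pressure idea concrete, here is a minimal sketch of demand-driven pulling, where the consumer asks for a bounded number of items and the producer only runs when asked. It is written in Python with plain generators as a stand-in; it is not the GenStage, Flow, or Broadway API, and the names `produce` and `consume_in_batches` are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def produce(limit: int, log: List[int]) -> Iterator[int]:
    """Hypothetical producer: emits 0..limit-1 and records each item it generates."""
    for i in range(limit):
        log.append(i)  # side effect that shows how far the producer has run
        yield i


def consume_in_batches(source: Iterable[int], demand: int) -> Iterator[List[int]]:
    """Pull at most `demand` items at a time; nothing is produced ahead of demand."""
    it = iter(source)
    while True:
        batch = list(islice(it, demand))
        if not batch:
            return
        yield batch


produced: List[int] = []
batches = consume_in_batches(produce(10, produced), demand=4)

first = next(batches)              # the consumer asks for one batch
produced_after_first = list(produced)  # the producer has only run that far

rest = list(batches)               # drain the remaining batches
```

Because the producer is only advanced by the consumer's requests, a slow consumer naturally slows the producer down instead of letting work pile up in an unbounded queue.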
It is also interesting to note there is no need to "go evented" if you need to fetch data from multiple HTTP servers: it can happen in the exact same process (even: in a background task attached to your HTTP server), as done here https://transport.data.gouv.fr/explore (if you zoom you will see vehicles moving in realtime, and ~80 data sources are being polled every 10 seconds & broadcast to the visitors via pubsub & websockets).
-
An opinionated map of incremental and streaming systems (2018)
Elixir has a few interesting abstractions for that: GenStage, Flow, Broadway.
https://github.com/dashbitco/flow
differential-datalog
- DDlog: A programming language for incremental computation
-
Feldera – a more performant streaming database based on Z-sets
Hi,
> I wonder if it lives up to the hype.
We do think so! (disclaimer: I'm a co-founder at Feldera)
To give some more background: We are co-designing/trialing Feldera with several industry/enterprise partners from different domains. Our core team also built differential datalog (https://github.com/vmware/differential-datalog) in the past. And while ddlog is used quite successfully in products today, we believe the many lessons we learned with ddlog will help us build an even better continuous analytics platform. FYI our code is open-source at https://github.com/feldera/feldera if you'd like to try it out.
Also feel free to join our community slack channel (https://www.feldera.com/slack/) if you have more questions.
-
Why Are There No Relational DBMSs? [pdf]
The relational model (and generally working at the level of sets/collections, instead of the level of individual values/objects) actually makes it easier to have this kind of incremental computation in a consistent way, I think.
There's a bunch of work being done on making relational systems work this way. Some interesting reading:
- https://www.scattered-thoughts.net/writing/an-opinionated-ma...
- https://materialize.com/ which is built on https://timelydataflow.github.io/differential-dataflow/, which has a lot of research behind it
- Which also can be a compilation target for Datalog: https://github.com/vmware/differential-datalog
- Some prototype work on building UI systems in exactly the way you describe using a relational approach: https://riffle.systems/essays/prelude/ (and HN discussion: https://news.ycombinator.com/item?id=30530120)
(There's a lot more too -- I have a hobby interest in this space, so I have a small collection of links)
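As a rough illustration of why working at the level of collections helps, here is a minimal Python sketch that keeps a join result up to date from changes alone. Rows carry integer weights (+1 for an insert, -1 for a delete), in the spirit of weighted sets; this is an assumption-laden toy, not the API of differential-dataflow, DDlog, Materialize, or Feldera, and all names and data are hypothetical.

```python
from typing import Dict, Tuple

ZSet = Dict[Tuple, int]  # row -> weight; +1 inserts a row, -1 deletes it


def add(a: ZSet, b: ZSet) -> ZSet:
    """Add weights row by row and drop rows whose weight cancels to zero."""
    out = dict(a)
    for row, w in b.items():
        out[row] = out.get(row, 0) + w
    return {row: w for row, w in out.items() if w != 0}


def join(a: ZSet, b: ZSet) -> ZSet:
    """Join (key, x) rows with (key, y) rows on key; weights multiply."""
    out: ZSet = {}
    for (ka, x), wa in a.items():
        for (kb, y), wb in b.items():
            if ka == kb:
                row = (ka, x, y)
                out[row] = out.get(row, 0) + wa * wb
    return {row: w for row, w in out.items() if w != 0}


def join_delta(a: ZSet, da: ZSet, b: ZSet, db: ZSet) -> ZSet:
    """Change in join(a, b) when a changes by da and b changes by db."""
    return add(add(join(da, b), join(a, db)), join(da, db))


users: ZSet = {(1, "ann"): 1, (2, "bob"): 1}
orders: ZSet = {(1, "book"): 1}
view = join(users, orders)

# One batch of changes: bob is deleted, two orders arrive.
d_users: ZSet = {(2, "bob"): -1}
d_orders: ZSet = {(1, "pen"): 1, (2, "mug"): 1}

# Update the view from the deltas only, then apply the deltas to the inputs.
view = add(view, join_delta(users, d_users, orders, d_orders))
users = add(users, d_users)
orders = add(orders, d_orders)
```

The updated `view` matches what a full recomputation of `join(users, orders)` would give, but only the changed rows were joined to get there.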
-
Differential Datalog: a programming language for incremental computation
Tutorial which I didn’t see linked in the README: https://github.com/vmware/differential-datalog/blob/master/d...
-
Show HN: Cozo – new Graph DB with Datalog, embedded like SQLite, written in Rust
This is amazing!
Have you looked at differential-datalog? It's Rust-based, maintained by VMware, and has a very rich, well-typed Datalog language. differential-datalog is in-memory only right now, but could be ideal to integrate your graph as a datastore or disk spill cache.
https://github.com/vmware/differential-datalog
-
Help wanted!
Sort of related, in my mind at least, is differential dataflow, e.g. https://github.com/vmware/differential-datalog
-
Datalog in JavaScript
It’s fascinating to see so many different parties converging on Datalog for reactive apps & UI.
- There are several such talks at https://www.hytradboi.com/ (happening this Friday)
- Roam Research and its clones Athens, Logseq, use Datascript / ClojureScript https://github.com/tonsky/datascript
- differential-datalog isn’t an end-to-end system, but is highly optimized for quick reactivity https://github.com/vmware/differential-datalog
- Datalog UI is a Typescript port of some of differential-datalog’s ideas https://datalogui.dev/
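For readers new to Datalog, the following Python sketch evaluates a classic two-rule program (reachability over edges) by repeatedly joining only the newly derived facts, which hints at why the language suits reactive, change-driven systems. It is a toy under stated assumptions, not how Datascript, differential-datalog, or Datalog UI are implemented; the function name and sample edges are hypothetical.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]


def transitive_closure(edges: Set[Edge]) -> Set[Edge]:
    """Evaluate the Datalog program
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    Each round joins only the facts derived in the previous round with edge."""
    path = set(edges)
    delta = set(edges)
    while delta:
        derived = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = derived - path  # keep only facts not seen before
        path |= delta
    return path


closure = transitive_closure({("a", "b"), ("b", "c"), ("c", "d")})
```

The loop stops once a round derives nothing new, so the result is the full set of reachable pairs.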
-
Call for Help - Open Source Datom/EAV/Fact database in Rust.
Rust related https://github.com/vmware/differential-datalog
-
Anything like Svelte/Jetpack Compose for Haskell?
Actually, that makes me wonder whether or not differential datalog falls under that umbrella, and if it could be applied in the same way Compose is.
What are some alternatives?
parallel_stream - A parallelized stream implementation for Elixir
scryer-prolog - A modern Prolog implementation written mostly in Rust.
MapDiff - Calculates the difference between two (nested) maps, and returns a map representing the patch of changes.
timely-dataflow - A modular implementation of timely dataflow in Rust
fsm - Finite State Machine data structure
materialize - The data warehouse for operational workloads.
graphmath - An Elixir library for performing 2D and 3D mathematics.
differential-dataflow - An implementation of differential dataflow using timely dataflow on Rust.
witchcraft - Monads and other dark magic for Elixir
datalevin - A simple, fast and versatile Datalog database
matrex - A blazing fast matrix library for Elixir/Erlang with C implementation using CBLAS.
logica - Logica is a logic programming language that compiles to SQL. It runs on Google BigQuery, PostgreSQL and SQLite.