timely-dataflow VS blog

Compare timely-dataflow vs blog and see what their differences are.

timely-dataflow

A modular implementation of timely dataflow in Rust (by TimelyDataflow)

blog

Some notes on things I find interesting and important. (by frankmcsherry)
              timely-dataflow   blog
Mentions      11                10
Stars         3,145             1,923
Growth        1.1%              -
Activity      7.2               6.9
Last commit   19 days ago       about 1 month ago
Language      Rust              JavaScript
License       MIT License       -
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

timely-dataflow

Posts with mentions or reviews of timely-dataflow. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-21.
  • Readyset: A MySQL and Postgres wire-compatible caching layer
    5 projects | news.ycombinator.com | 21 Feb 2024
    They have a bit about their technical foundation here[0].

    Given that Readyset was co-founded by Jon Gjengset (who has apparently since departed the company), who authored the paper on Noria[1], I would assume that Readyset is the continuation of that research.

    So it shares some roots with Materialize. They have a common conceptual ancestry in Naiad[2], and Materialize evolved out of timely-dataflow[3].

    [0]: https://docs.readyset.io/concepts/streaming-dataflow

    [1]: https://jon.thesquareplanet.com/papers/osdi18-noria.pdf

    [2]: https://dl.acm.org/doi/10.1145/2517349.2522738

    [3]: https://github.com/TimelyDataflow/timely-dataflow

  • Mandala: experiment data management as a built-in (Python) language feature
    4 projects | /r/ProgrammingLanguages | 11 Apr 2023
    And systems like timely dataflow, https://github.com/TimelyDataflow/timely-dataflow
  • Arroyo: A distributed stream processing engine written in Rust
    3 projects | /r/rust | 4 Apr 2023
    Project looks cool! Glad you open sourced it. It could use some comments in the code base to help contributors ;). I also like the datafusion usage, that is awesome. BTW I work on github.com/bytewax/bytewax, which is based on https://github.com/TimelyDataflow/timely-dataflow, another Rust dataflow computation engine.
  • Rust MPI -- Will there ever be a fully oxidized implementation?
    4 projects | /r/rust | 5 Mar 2023
    Just found this https://github.com/TimelyDataflow/timely-dataflow and my heart skipped a beat.
  • Streaming processing in Python using Timely Dataflow with Bytewax
    1 project | /r/Python | 9 Nov 2022
    Bytewax is a Python native binding to the Timely Dataflow library (written in Rust) for building highly scalable streaming (and batch) processing pipelines.
  • Alternative Kafka Integration Framework to Kafka Connect?
    3 projects | /r/apachekafka | 21 Jun 2022
    I am working on Bytewax, which is a Python stream processing framework built on Timely Dataflow. It is not exactly a Kafka integration framework because it is more of a general stream processing framework, but it might be interesting for you. We are focused on enabling people to more easily debug, containerize, parallelize and customize, and less on enabling a declarative integration framework. It is still early days for us! And we are looking for feedback and ideas from the community.
  • [AskJS] JavaScript for data processing
    5 projects | /r/javascript | 27 May 2022
    We used to use a library called Pond.js, https://github.com/esnet/pond, but the reliance on Immutable.JS caused some performance pitfalls, so we wrote a system from scratch that deals with data in a batched streaming fashion. A lot of the concepts were borrowed from a Rust library called timely-dataflow, https://github.com/TimelyDataflow/timely-dataflow.
  • Dataflow: An Efficient Data Processing Library for Machine Learning
    2 projects | /r/rust | 17 Jan 2022
    Though the name "Dataflow" might be an unfortunate name conflict with another Rust project: https://github.com/TimelyDataflow/timely-dataflow
  • Ask HN: Is there a way to subscribe to an SQL query for changes?
    17 projects | news.ycombinator.com | 22 Apr 2021
    > In the simplest case, I'm talking about regular SQL non-materialized views which are essentially inlined.

    I see that now -- makes sense!

    > Wish we had some better database primitives to assemble rather than building everything on Postgres - its not ideal for a lot of things.

    I'm curious to hear more about this! We agree that better primitives are required and that's why Materialize is written in Rust using TimelyDataflow[1] and DifferentialDataflow[2] (both developed by Materialize co-founder Frank McSherry). The only relationship between Materialize and Postgres is that we are wire-compatible with Postgres; we don't share any code with Postgres, nor do we have a dependence on it.

    [1] https://github.com/TimelyDataflow/timely-dataflow

  • 7 Real-Time Data Streaming Tools You Should Consider On Your Next Project
    2 projects | dev.to | 20 Mar 2021
    Under the hood, Materialize uses Timely Dataflow (TDF) as the stream-processing engine. This allows Materialize to take advantage of the distributed data-parallel compute engine. The great thing about using TDF is that it has been in open source development since 2014 and has since been battle-tested in production at large Fortune 1000-scale companies.

blog

Posts with mentions or reviews of blog. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-02-07.
  • Advent of Code 2023 in Recursive SQL
    1 project | news.ycombinator.com | 18 Jan 2024
  • Big Data Is Dead
    3 projects | news.ycombinator.com | 7 Feb 2023
    This reminds me of a great blog post by Frank McSherry (Materialize, timely dataflow, etc) talking about how using the right tools on a laptop could beat out a bunch of these JVM distributed querying tools because... data locality basically.

    https://github.com/frankmcsherry/blog/blob/master/posts/2015...

  • Quokka and Spark/Databricks
    2 projects | /r/dataengineering | 1 Jan 2023
  • Rust for Data-Intensive Computation (2020)
    1 project | news.ycombinator.com | 24 Nov 2022
  • Cost in the Land of Databases (2017)
    1 project | news.ycombinator.com | 24 Nov 2022
  • Show HN: Cozo – new Graph DB with Datalog, embedded like SQLite, written in Rust
    8 projects | news.ycombinator.com | 8 Nov 2022
    Oh, cool!

    And yeah, licenses can be challenging and frustrating, especially the first time you release a major project.

    I am really super excited by the idea of embedded Datalog in Rust. I run into a lot of situations where I need something that fits in that awkward gap between SQL and Prolog. I want more expressiveness, better composability, and better graph support than SQL. But I also want finite-sized results that I can materialize in bounded time.

    There has been some very neat work with incrementally-updated Datalog in the Rust community. For example, I think Datafrog is really neat: https://github.com/frankmcsherry/blog/blob/master/posts/2018... But it's great to see more neat projects in this space, so thank you.

  • [AskJS] JavaScript for data processing
    5 projects | /r/javascript | 27 May 2022
  • Differential Dataflow for Mere Mortals
    3 projects | news.ycombinator.com | 15 Jun 2021
    They used to, but Frank McSherry (author of differential dataflow) wrote them a specialized version without all the dataflow infrastructure [1]. It's part of the rust-lang nursery [2] now but hasn't been updated in a while, so I'm not sure what happened to it.

    [1] https://github.com/frankmcsherry/blog/blob/master/posts/2018...

    [2] https://github.com/rust-lang/datafrog

  • Why isn't differential dataflow more popular?
    13 projects | news.ycombinator.com | 22 Jan 2021
    Importantly, this doesn't just use memoization (it actually avoids having to spend memory on that), but rather uses operators (nodes in the dataflow graph) that directly work with `(time, data, delta)` tuples. The `time` is a general lattice, so fairly flexible (e.g. for expressing loop nesting/recursive computations, but also for handling multiple input sources with their own timestamps), and the `delta` type is between a (potentially commutative) semigroup (don't be confused, they use addition as the group operation) and an abelian group. E.g. collections that are iteratively refined in loops often need an abelian `delta` type, while monoids (semigroup + explicit zero element) allow for efficient append-only computations [0].

    [0]: https://github.com/frankmcsherry/blog/blob/master/posts/2019...
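
The `(time, data, delta)` tuples described in the comment above can be illustrated with a small standalone sketch. It does not use the differential-dataflow crate or its API. The `Update` alias and the `accumulate` function are hypothetical names chosen for illustration. Times are assumed to be plain `u64` values (a total order, the simplest case of a lattice) and deltas are assumed to be `i64` (an abelian group under addition, so retractions are possible).

```rust
use std::collections::BTreeMap;

// Hypothetical update type: at logical `time`, the multiplicity of `data`
// changes by `delta`. Assumes u64 times and i64 deltas for simplicity.
type Update = (u64, &'static str, i64);

// Rebuild the collection as of a given time by summing every delta whose
// time is less than or equal to `as_of`. Records whose deltas cancel out
// to zero are dropped from the result.
fn accumulate(updates: &[Update], as_of: u64) -> BTreeMap<&'static str, i64> {
    let mut counts = BTreeMap::new();
    for &(time, data, delta) in updates {
        if time <= as_of {
            *counts.entry(data).or_insert(0) += delta;
        }
    }
    counts.retain(|_, count| *count != 0);
    counts
}

fn main() {
    let updates: Vec<Update> = vec![
        (0, "apple", 1),  // insert "apple" at time 0
        (0, "pear", 1),   // insert "pear" at time 0
        (1, "apple", -1), // retract "apple" at time 1 (needs negative deltas)
        (2, "apple", 2),  // insert two copies of "apple" at time 2
    ];
    for t in 0..=2 {
        println!("as of {}: {:?}", t, accumulate(&updates, t));
    }
}
```

The retraction at time 1 is what requires inverses in the `delta` type. If every delta were non-negative, a monoid would be enough, which matches the append-only case the comment mentions.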

What are some alternatives?

When comparing timely-dataflow and blog you can also consider the following projects:

noria - Fast web applications through dynamic, partially-stateful dataflow

differential-dataflow - An implementation of differential dataflow using timely dataflow on Rust.

differential-datalog - DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not write incremental algorithms; instead they specify the desired input-output mapping in a declarative manner.

btree-typescript - A reasonably fast in-memory B+ tree with a powerful API based on the standard Map. Small minified. Well documented.

materialize - The data warehouse for operational workloads.

rslint - A (WIP) Extremely fast JavaScript and TypeScript linter and Rust crate

bytewax - Python Stream Processing

Hydra - Functional hybrid modelling (FHM) language for modelling and simulation of physical systems using implicitly formulated (undirected) Differential Algebraic Equations (DAEs)

realtime - Broadcast, Presence, and Postgres Changes via WebSockets

pond - Immutable timeseries data structures built with Typescript