pond
blog
Metric | pond | blog
---|---|---
Mentions | 1 | 10
Stars | 204 | 1,926
Growth | 0.0% | -
Activity | 0.0 | 6.7
Last commit | about 1 year ago | 6 days ago
Language | TypeScript | JavaScript
License | GNU General Public License v3.0 or later | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pond
- [AskJS] JavaScript for data processing
We used to use a library called Pond.js, https://github.com/esnet/pond, but its reliance on Immutable.js caused some performance pitfalls, so we wrote a system from scratch that processes data in a batched, streaming fashion. A lot of the concepts were borrowed from a Rust library called timely-dataflow, https://github.com/TimelyDataflow/timely-dataflow.
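To illustrate what processing data "in a batched streaming fashion" can look like, here is a minimal Python sketch. It is not the commenter's system and does not use the Pond.js or timely-dataflow APIs. The event shape `(timestamp, value)` and the names `batches` and `windowed_mean` are assumptions made up for this example. The idea shown is to pull events in fixed-size chunks and update mutable accumulators in place, instead of building a new immutable structure for every event.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event shape: (timestamp in seconds, measured value).
Event = Tuple[int, float]


def batches(events: Iterable[Event], size: int) -> Iterator[List[Event]]:
    """Yield consecutive lists of at most `size` events from a stream."""
    it = iter(events)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def windowed_mean(events: Iterable[Event], window: int,
                  batch_size: int = 1024) -> Dict[int, float]:
    """Average values per fixed-size time window, reading input in batches."""
    # window start -> [running sum, count]; mutated in place per event
    sums: Dict[int, list] = {}
    for batch in batches(events, batch_size):
        for ts, value in batch:
            start = ts - ts % window  # align timestamp to its window start
            acc = sums.setdefault(start, [0.0, 0])
            acc[0] += value
            acc[1] += 1
    return {start: total / count for start, (total, count) in sums.items()}
```

For example, `windowed_mean([(0, 1.0), (5, 3.0), (10, 10.0)], window=10, batch_size=2)` groups the first two events into the window starting at 0 and the third into the window starting at 10.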
blog
- Advent of Code 2023 in Recursive SQL
- Big Data Is Dead
This reminds me of a great blog post by Frank McSherry (Materialize, timely dataflow, etc) talking about how using the right tools on a laptop could beat out a bunch of these JVM distributed querying tools because... data locality basically.
https://github.com/frankmcsherry/blog/blob/master/posts/2015...
- Quokka and Spark/Databricks
- Rust for Data-Intensive Computation (2020)
- Cost in the Land of Databases (2017)
- Show HN: Cozo – new Graph DB with Datalog, embedded like SQLite, written in Rust
Oh, cool!
And yeah, licenses can be challenging and frustrating, especially the first time you release a major project.
I am really super excited by the idea of embedded Datalog in Rust. I run into a lot of situations where I need something that fits in that awkward gap between SQL and Prolog. I want more expressiveness, better composability, and better graph support than SQL. But I also want finite-sized results that I can materialize in bounded time.
There has been some very neat work with incrementally-updated Datalog in the Rust community. For example, I think Datafrog is really neat: https://github.com/frankmcsherry/blog/blob/master/posts/2018... But it's great to see more neat projects in this space, so thank you.
- [AskJS] JavaScript for data processing
- Differential Dataflow for Mere Mortals
They used to but Frank McSherry (author of differential dataflow) wrote them a specialized version without all the dataflow infrastructure [1]. It's part of the rust-lang nursery [2] now but hasn't been updated in a while, so I'm not sure what happened to it.
[1] https://github.com/frankmcsherry/blog/blob/master/posts/2018...
[2] https://github.com/rust-lang/datafrog
- Why isn't differential dataflow more popular?
Importantly, this doesn't just use memoization (it actually avoids having to spend memory on that), but rather uses operators (nodes in the dataflow graph) that directly work with `(time, data, delta)` tuples. The `time` is a general lattice, so fairly flexible (e.g. for expressing loop nesting/recursive computations, but also for handling multiple input sources with their own timestamps), and the `delta` type is between a (potentially commutative) semigroup (don't be confused, they use addition as the group operation) and an abelian group. E.g. collections that are iteratively refined in loops often need an abelian `delta` type, while monoids (semigroup + explicit zero element) allow for efficient append-only computations [0].
[0]: https://github.com/frankmcsherry/blog/blob/master/posts/2019...
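The `(time, data, delta)` representation described in the comment above can be sketched in a few lines of Python. This is an illustration only, not the differential-dataflow API. It assumes totally ordered integer times (the comment notes the real `time` is a general lattice) and integer deltas, which form an abelian group under addition. The function names `consolidate` and `collection_at` are hypothetical.

```python
from collections import defaultdict


def consolidate(updates):
    """Sum the deltas of updates sharing (time, data); drop zero totals."""
    totals = defaultdict(int)
    for time, data, delta in updates:
        totals[(time, data)] += delta
    return sorted((t, d, n) for (t, d), n in totals.items() if n != 0)


def collection_at(updates, time):
    """Rebuild the collection as of `time` by accumulating earlier deltas."""
    counts = defaultdict(int)
    for t, data, delta in updates:
        if t <= time:  # assumes a total order; a lattice needs a partial-order test
            counts[data] += delta
    return {d: n for d, n in counts.items() if n != 0}


# Assumed sample input: "a" and "b" are added at time 0, "a" is retracted
# at time 1, and "c" is added and retracted within time 1.
updates = [(0, "a", 1), (0, "b", 1), (1, "a", -1), (1, "c", 2), (1, "c", -2)]
```

With this input, `consolidate(updates)` cancels the two `"c"` updates, and `collection_at(updates, 1)` contains only `"b"`. Retracting `"a"` relies on negative deltas, which is why the sketch uses a group. An append-only computation would never subtract, so a type without inverses would be enough there.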
What are some alternatives?
btree-typescript - A reasonably fast in-memory B+ tree with a powerful API based on the standard Map. Small minified. Well documented.
differential-dataflow - An implementation of differential dataflow using timely dataflow on Rust.
persistent-ts - Persistent data structures for Typescript
Hydra - Functional hybrid modelling (FHM) language for modelling and simulation of physical systems using implicitly formulated (undirected) Differential Algebraic Equations (DAEs)
rslint - A (WIP) Extremely fast JavaScript and TypeScript linter and Rust crate
differential-datalog - DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not write incremental algorithms; instead they specify the desired input-output mapping in a declarative manner.
ballista - Distributed compute platform implemented in Rust, and powered by Apache Arrow.
memray - Memray is a memory profiler for Python
reflow - A language and runtime for distributed, incremental data processing in the cloud
diagnostics - Diagnostic tools for timely dataflow computations
cozo - A transactional, relational-graph-vector database that uses Datalog for query. The hippocampus for AI!