differential-dataflow
cloudquery
| | differential-dataflow | cloudquery |
|---|---|---|
| | 14 | 102 |
| Stars | 2,473 | 5,584 |
| Growth | 0.8% | 0.9% |
| Activity | 8.3 | 10.0 |
| | 6 days ago | 7 days ago |
| Language | Rust | Go |
| License | MIT License | Mozilla Public License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
differential-dataflow
-
We Built a Streaming SQL Engine
Some recent solutions to this problem include Differential Dataflow and Materialize. It would be neat if postgres adopted something similar for live-updating materialized views.
https://github.com/timelydataflow/differential-dataflow
https://materialize.com/
-
Hydroflow: Dataflow Runtime in Rust
I'm looking for this but can't find it, how does this project compare to differential dataflow?
As a sibling commenter mentioned, it's built on timely dataflow (which is lower-level), but that already has differential dataflow[0] built on top of it by the same authors.
How do they differ?
[0]: https://github.com/TimelyDataflow/differential-dataflow
- Using Rust to write a Data Pipeline. Thoughts. Musings.
- PlanetScale Boost
- Program Synthesis is Possible (2018)
-
Convex vs. Firebase
hi! sujay from convex here. I remember reading about your "reverse query engine" when we were getting started last year and really liking that framing of the broadcast problem here.
as james mentions, we entirely re-run the javascript function whenever we detect any of its inputs change. incrementality at this layer would be very difficult, since we're dealing with a general purpose programming language. also, since we fully sandbox and determinize these javascript "queries," the majority of the cost is in accessing the database.
eventually, I'd like to explore "reverse query execution" on the boundary between javascript and the underlying data using an approach like differential dataflow [1]. the materialize folks [2] have made a lot of progress applying it for OLAP and readyset [3] is using similar techniques for OLTP.
[1] https://github.com/TimelyDataflow/differential-dataflow
[2] https://materialize.com/
[3] https://readyset.io/
-
Announcing avalanche 0.1, a React- and Svelte-inspired GUI library
differential dataflow which is used to power materialize db
-
Differential Datalog
It's partially inspired by Linq, so the similarity you see is expected.
It's not really arbitrary structures so much, though you're mostly free in what record type you use in a relation (structs and tagged enums are typical, though).
The incremental part is that you can feed it changes to the input (additions/retractions of facts) and get changes to the outputs back with low latency (you can alternatively just use it to keep an index up-to-date, where you can quickly look up based on a key (like a materialized view in SQL)).
This [0] section in the readme of the underlying incremental dataflow framework may help get the concept across, but feel free to follow up if you're still not seeing the incrementality.
[0]: https://github.com/TimelyDataflow/differential-dataflow#an-e...
- Dbt and Materialize
- Materialized view questions
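
The incremental model described in the Differential Datalog comment above (feed in additions and retractions, get back only the changes to the output) can be illustrated with a small standard-library sketch. This is not the differential-dataflow or DDlog API; `CountView`, `Update`, and `apply` are hypothetical names chosen for illustration, and the view maintained here is just a count of records per key.

```rust
use std::collections::HashMap;

/// A change to a collection: a record plus a signed multiplicity
/// (+1 for an addition, -1 for a retraction). Hypothetical type alias.
type Update<T> = (T, i64);

/// Hypothetical incrementally maintained "count per key" view.
struct CountView {
    counts: HashMap<String, i64>,
}

impl CountView {
    fn new() -> Self {
        CountView { counts: HashMap::new() }
    }

    /// Apply a batch of input changes and return only the output changes:
    /// the old (key, count) row is retracted and the new one is added.
    fn apply(&mut self, batch: &[Update<String>]) -> Vec<Update<(String, i64)>> {
        // Consolidate the batch into one net change per key.
        let mut net: HashMap<&str, i64> = HashMap::new();
        for (key, diff) in batch {
            *net.entry(key.as_str()).or_insert(0) += diff;
        }

        let mut out = Vec::new();
        for (key, diff) in net {
            if diff == 0 {
                continue; // additions and retractions cancelled out
            }
            let old = self.counts.get(key).copied().unwrap_or(0);
            let new = old + diff;
            if old != 0 {
                out.push(((key.to_string(), old), -1)); // retract stale row
            }
            if new != 0 {
                out.push(((key.to_string(), new), 1)); // add updated row
                self.counts.insert(key.to_string(), new);
            } else {
                self.counts.remove(key);
            }
        }
        out.sort(); // deterministic order for readers and tests
        out
    }
}
```

Calling `apply` with a batch that only touches key `a` produces output changes only for `a`; rows for untouched keys are never recomputed or re-emitted, which is the property the comment refers to as incrementality. A real engine generalizes this to joins, iteration, and timestamps, which this sketch does not attempt.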
cloudquery
-
We might want to regularly keep track of how important each server is
Check out CloudQuery - https://github.com/cloudquery/cloudquery for an easy cloud asset inventory.
-
Cloud asset tracking
These both do something like what you're looking for: https://github.com/cloudquery/cloudquery https://github.com/openraven/magpie
-
Show HN: Nango – Open unified API for product integrations
A unified API is a holy grail, but as many have said, it is quite difficult to abstract every use case in a scalable way that won't break. At CloudQuery (https://github.com/cloudquery/cloudquery) we focus solely on the ELT use case (founder/maintainer here).
-
Welcome to Datasette Cloud
Congrats!! How does it compare to the ELT space and the modern data stack where you have ingestion/storage/visualization layers decoupled?
Asking as the founder of CloudQuery (https://github.com/cloudquery/cloudquery). I've seen Datasette quite a few times around data exploration, but I'm curious to hear about the most popular use cases of Datasette!
-
Launch HN: PeerDB (YC S23) – Fast, Native ETL/ELT for Postgres
Congrats!! We also focus on performance at CloudQuery (https://github.com/cloudquery/cloudquery) by using Golang and gRPC, while still trying to be abstract enough to support different databases :)
In any case good luck!
-
airbyte VS cloudquery - a user suggested alternative
2 projects | 2 Jun 2023
CloudQuery for ETL
2 projects | 2 Jun 2023
Another ELT framework that's an alternative to Airbyte
-
meltano VS cloudquery - a user suggested alternative
2 projects | 2 Jun 2023
Another alternative ELT framework
-
RDS to S3 Options
Check out CloudQuery; we have a PostgreSQL source connector and an S3 destination that supports Parquet (disclaimer: maintainer and founder here).
-
Cloudquery, Resoto, Steampipe, or Airbyte?
Hello! I'm Yevgeny, founder and maintainer at CloudQuery. We've built CloudQuery as an open-source, high-performance ELT framework, so you should get pretty good results syncing all your cloud assets from a high number of accounts (we have users syncing more than 10K Azure subscriptions and thousands of AWS accounts concurrently).
What are some alternatives?
ballista - Distributed compute platform implemented in Rust, and powered by Apache Arrow.
steampipe - Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.
materialize - The data warehouse for operational workloads.
steampipe-mod-aws-compliance - Run individual controls or full compliance benchmarks for CIS, PCI, NIST, HIPAA and more across all of your AWS accounts using Powerpipe and Steampipe.
reflow - A language and runtime for distributed, incremental data processing in the cloud
cloud-custodian - Rules engine for cloud security, cost optimization, and governance, DSL in yaml for policies to query, filter, and take actions on resources
differential-datalog - DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not write incremental algorithms; instead they specify the desired input-output mapping in a declarative manner.
cloudsploit - Cloud Security Posture Management (CSPM)
timely-dataflow - A modular implementation of timely dataflow in Rust
cartography - Cartography is a Python tool that consolidates infrastructure assets and the relationships between them in an intuitive graph view powered by a Neo4j database.
clj-3df - Clojure(Script) client for Declarative Dataflow.
opencspm - Open Cloud Security Posture Management Engine