As mentioned in the blog post, clusters allow horizontal scalability and daisy chaining, so you can allocate more memory for your views even if you run up against the limits of how much memory you can fit on a single machine. We've got plans in the works to support out-of-core execution, too.
> Also they do not integrate at all with custom data types in Postgres IME. E.g. an enumeration in your table will mean materialize can't read the table as a source. Lame.
We're aware of this and are working on a fix. There are two tracking issues, if you'd like to follow along:
* #6818 (https://github.com/MaterializeInc/materialize/issues/6818) is specifically about supporting PostgreSQL enum types
We do something similar, but in 2), instead of using the outbox pattern, we make use (in several different settings) of integers that are guaranteed to increment in commit order, then each consumer can track where their cursor is on the feed of changes. This requires some more coordination but it means that publishers of changes don't need one outbox per consumer or similar.
Then you can have "processes" that query for new data in an input table, and update aggregates/derived tables from that simply by "select * ... where ChangeSequenceNumber > @MaxSequenceNumberFromPreviousExecution"...
Here's the idea implemented for Microsoft SQL Server for the OLTP case:
https://github.com/vippsas/mssql-changefeed
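A minimal sketch of that cursor-on-a-changefeed pattern, using Python and SQLite (names and schema are hypothetical). SQLite's single-writer model trivially guarantees commit-order sequence numbers; in Postgres or SQL Server you'd need the extra coordination mentioned above (e.g. what mssql-changefeed provides):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A change feed whose seq increases in commit order.
    CREATE TABLE orders_changes (
        seq      INTEGER PRIMARY KEY AUTOINCREMENT,
        order_id INTEGER NOT NULL,
        amount   REAL NOT NULL
    );
    -- Each consumer tracks its own cursor into the feed,
    -- so publishers don't need one outbox per consumer.
    CREATE TABLE consumer_cursor (
        consumer TEXT PRIMARY KEY,
        last_seq INTEGER NOT NULL DEFAULT 0
    );
""")

def publish(order_id, amount):
    conn.execute(
        "INSERT INTO orders_changes (order_id, amount) VALUES (?, ?)",
        (order_id, amount))
    conn.commit()

def consume(consumer):
    """Fetch all changes after this consumer's cursor, then advance it."""
    conn.execute(
        "INSERT OR IGNORE INTO consumer_cursor (consumer) VALUES (?)",
        (consumer,))
    (last_seq,) = conn.execute(
        "SELECT last_seq FROM consumer_cursor WHERE consumer = ?",
        (consumer,)).fetchone()
    rows = conn.execute(
        "SELECT seq, order_id, amount FROM orders_changes "
        "WHERE seq > ? ORDER BY seq",
        (last_seq,)).fetchall()
    if rows:
        conn.execute(
            "UPDATE consumer_cursor SET last_seq = ? WHERE consumer = ?",
            (rows[-1][0], consumer))
        conn.commit()
    return rows

publish(1, 9.99)
publish(2, 25.00)
batch1 = consume("aggregator")   # sees both changes
publish(3, 5.00)
batch2 = consume("aggregator")   # sees only the change after its cursor
```

Each call to `consume` is exactly the "`WHERE seq > @MaxSequenceNumberFromPreviousExecution`" query from the comment above, with the cursor persisted per consumer.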
I use PG with an alternative materialized views implementation[0] that is pure PlPgSQL, exposes real tables that triggers can write to, and lets the views be marked stale too.
This means hand-coding triggers to keep the materializations up to date, or else to mark them as out of date (because maybe some operations would be slow or hard to hand-code triggers for), but this works remarkably well.
As a bonus, I get an update history table that can be used to generate updates to external systems.
In principle one can get the AST for a VIEW's query from the PG catalog and use that to generate triggers on the tables it queries to keep it up to date. In practice that's only trivial for some kinds of queries, and I've not written such a tool yet.
[0] https://github.com/twosigma/postgresql-contrib/blob/master/m...
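A rough sketch of the hand-coded-trigger approach, in Python with SQLite triggers (the linked implementation is PlPgSQL, so syntax differs, but the shape is the same): the "materialized view" is an ordinary table kept current by triggers on the base table. Schema and names here are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (
        id     INTEGER PRIMARY KEY,
        region TEXT NOT NULL,
        amount REAL NOT NULL
    );
    -- The materialization: per-region totals, a real, writable table.
    CREATE TABLE sales_by_region (
        region TEXT PRIMARY KEY,
        total  REAL NOT NULL
    );
    -- Hand-coded maintenance triggers.
    CREATE TRIGGER sales_ins AFTER INSERT ON sales BEGIN
        INSERT OR IGNORE INTO sales_by_region (region, total)
            VALUES (NEW.region, 0);
        UPDATE sales_by_region SET total = total + NEW.amount
            WHERE region = NEW.region;
    END;
    CREATE TRIGGER sales_del AFTER DELETE ON sales BEGIN
        UPDATE sales_by_region SET total = total - OLD.amount
            WHERE region = OLD.region;
    END;
""")

conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)",
                 [("east", 10.0), ("west", 5.0), ("east", 2.5)])
conn.execute("DELETE FROM sales WHERE id = 2")
totals = dict(conn.execute("SELECT region, total FROM sales_by_region"))
```

For aggregates like SUM this incremental maintenance is cheap; for queries where it would be slow or hard to express, the triggers can instead just mark the materialization stale, as described above.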
Please also take a look at https://github.com/risingwavelabs/risingwave if you are looking for advanced streaming databases. It is under the Apache License and also supports on-prem deployment (Docker, Kubernetes) with a full feature set: distributed clustering, compute-storage disaggregation, etc.
Related posts
-
Proton, a fast and lightweight alternative to Apache Flink
-
We Built a Streaming SQL Engine
-
Query Real Time Data in Kafka Using SQL
-
What makes a time series oriented database (ex: QuestDB) more efficient for OLAP on time series than an OLAP "only" oriented database (ex: DuckDB) technically?
-
How to handle partial updates and bulk updates in the source systems