The Next Generation of Materialize

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • materialize

    The data warehouse for operational workloads. (by MaterializeInc)

  • As mentioned in the blog post, clusters allow horizontal scalability and daisy chaining, so you can allocate more memory for your views even if you run up against the limits of how much memory you can fit on a single machine. We've got plans in the works to support out-of-core execution, too.

    > Also they do not integrate at all with custom data types in Postgres IME. E.g. an enumeration in your table will mean materialize can’t read the table as a source. Lame.

    We're aware of this and are working on a fix. There are two tracking issues, if you'd like to follow along:

    * #6818 (https://github.com/MaterializeInc/materialize/issues/6818) is specifically about supporting PostgreSQL enum types

  • mssql-changefeed

  • We do something similar, but in 2), instead of using the outbox pattern, we make use (in several different settings) of integers that are guaranteed to increment in commit order, then each consumer can track where their cursor is on the feed of changes. This requires some more coordination but it means that publishers of changes don't need one outbox per consumer or similar.

    Then you can have "processes" that query for new data in an input table, and update aggregates/derived tables from that simply by "select * ... where ChangeSequenceNumber > @MaxSequenceNumberFromPreviousExecution"...

    The idea is implemented here for Microsoft SQL Server, for the OLTP case:

    https://github.com/vippsas/mssql-changefeed
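
The cursor-based consumption described in the comment above can be sketched roughly as follows. This is only an illustration, not the mssql-changefeed API: it uses Python's built-in SQLite with a single connection, so the auto-incremented `ChangeSequenceNumber` trivially matches commit order. The table name `changes`, the `payload` column, and the helper `fetch_changes` are hypothetical names chosen for the example.

```python
import sqlite3


def fetch_changes(conn, last_seen):
    """Return rows past the consumer's cursor, plus the advanced cursor."""
    rows = conn.execute(
        "SELECT ChangeSequenceNumber, payload FROM changes "
        "WHERE ChangeSequenceNumber > ? ORDER BY ChangeSequenceNumber",
        (last_seen,),
    ).fetchall()
    # If nothing new arrived, the cursor stays where it was.
    new_cursor = rows[-1][0] if rows else last_seen
    return rows, new_cursor


# Hypothetical feed table; a single writer stands in for the
# "increments in commit order" guarantee that the comment relies on.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changes ("
    "ChangeSequenceNumber INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
)
conn.executemany("INSERT INTO changes (payload) VALUES (?)", [("a",), ("b",)])

cursor_pos = 0  # each consumer stores its own position in the feed
batch1, cursor_pos = fetch_changes(conn, cursor_pos)

conn.execute("INSERT INTO changes (payload) VALUES ('c')")
batch2, cursor_pos = fetch_changes(conn, cursor_pos)
```

Each consumer only needs to persist its own `cursor_pos`, which is why the publisher does not need one outbox per consumer. With concurrent writers, making the sequence numbers follow commit order is the hard part, and this sketch does not attempt it.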

  • ksql

    The database purpose-built for stream processing applications.

  • postgresql-contrib

  • I use PG with an alternative materialized views implementation[0] that is pure PL/pgSQL and that exposes real tables that triggers can write to, and where the views can be marked stale too.

    This means hand-coding triggers to keep the materializations up to date, or else to mark them as out of date (because maybe some operations would be slow or hard to hand-code triggers for), but this works remarkably well.

    As a bonus, I get an update history table that can be used to generate updates to external systems.

    In principle one can get the AST for a VIEW's query from the PG catalog and use that to generate triggers on the tables it queries to keep it up to date. In practice that's only trivial for some kinds of queries, and I've not written such a tool yet.

    [0] https://github.com/twosigma/postgresql-contrib/blob/master/m...
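
As a rough illustration of the hand-coded-trigger approach described in the comment above, here is a minimal sketch. It is not the linked PL/pgSQL implementation: it uses Python's built-in SQLite instead of PostgreSQL, and the tables `orders` and `customer_totals` and the trigger names are hypothetical. It also omits the stale marking and the update history table mentioned in the comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    amount INTEGER NOT NULL
);

-- The "materialization": a real table kept current by triggers.
CREATE TABLE customer_totals (
    customer TEXT PRIMARY KEY,
    total INTEGER NOT NULL
);

-- On insert, create the customer's row if needed, then add the amount.
CREATE TRIGGER orders_ai AFTER INSERT ON orders
BEGIN
    INSERT OR IGNORE INTO customer_totals (customer, total)
    VALUES (NEW.customer, 0);
    UPDATE customer_totals
    SET total = total + NEW.amount
    WHERE customer = NEW.customer;
END;

-- On delete, subtract the removed amount.
CREATE TRIGGER orders_ad AFTER DELETE ON orders
BEGIN
    UPDATE customer_totals
    SET total = total - OLD.amount
    WHERE customer = OLD.customer;
END;
""")

conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 10), ("alice", 5), ("bob", 7)],
)
conn.execute("DELETE FROM orders WHERE customer = 'alice' AND amount = 5")

# Read the maintained table directly; no re-aggregation of orders is needed.
totals = dict(conn.execute("SELECT customer, total FROM customer_totals"))
```

An update trigger would be needed as well for a complete version. Each kind of write needing its own hand-written trigger is the cost the comment refers to.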

  • risingwave

    SQL stream processing, analytics, and management. PostgreSQL simplicity, unrivaled performance, and seamless elasticity. 🚀 10x more productive. 🚀 10x more cost-efficient.

  • Please also take a look at https://github.com/risingwavelabs/risingwave if you are looking for an advanced streaming database. It is released under the Apache License and also supports on-prem deployment (Docker, Kubernetes) with the full feature set: distributed clustering, compute-storage disaggregation, etc.

  • pg_ivm

    IVM (Incremental View Maintenance) implementation as a PostgreSQL extension

Related posts

  • Proton, a fast and lightweight alternative to Apache Flink

    7 projects | news.ycombinator.com | 30 Jan 2024
  • We Built a Streaming SQL Engine

    3 projects | news.ycombinator.com | 21 Oct 2023
  • Query Real Time Data in Kafka Using SQL

    7 projects | dev.to | 23 Mar 2023
  • What makes a time series oriented database (ex: QuestDB) more efficient for OLAP on time series than an OLAP "only" oriented database (ex: DuckDB) technically?

    1 project | /r/dataengineering | 23 Jan 2023
  • How to handle partial updates and bulk updates in the source systems

    1 project | /r/dataengineering | 5 Jan 2023