-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
TimescaleDB
An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
ClickHouse [1] is a "fast open source (almost)SQL analytics database" you are looking for :)
[1] https://clickhouse.com/
https://materialize.com/ is billed to be that. That behavior is not trivial to implement.
Interesting. I check back on the wiki page every couple of years to see what progress is being made.
The basic idea seems to be to track the primary keys of the base tables in the incremental view and then use triggers to update those rows when source rows are updated.
The meat of the project is over here for anyone that’s interested - in particular this section about the limitations is pretty interesting (and expected).
https://github.com/sraoss/pg_ivm/#supported-view-definitions...
(NB - blog author/Timescale employee)
One thing we're improving as we move forward in documentation and other areas is explaining why doing joins (and things like window functions) is difficult in continuous aggregates and not the current focus. Honestly, it's part of the reason most databases haven't tackled this problem before.
Once you add in joins or things that might refer to data outside of the refresh window (LAG values for example), things get really complicated. For instance, if you join to a dimension table and a piece of metadata changes, does that change now need to be updated and reflected back in all of this historical aggregate data that's outside of the automatic refresh policy? Same with a window function - if data within a window hasn't changed but data that *might* be hit because of the window function reference does change, continuous aggregates would have to know about that for each query and track those changes too.
I'm not saying it's impossible or that it won't be solved someday, but the functionality with continuous aggregates that keeps the aggregate data updated automatically (without losing any history) *and* being able to perform fast joins on the finalized data is a very useful step that's not available anywhere else within the Postgres ecosystem.
RE: CAGG on top of a CAGG - you're certainly not the only person to request this[1] () and we understand that. Part of this is because of what I discussed above (tracking changes across multiple tables), although having finalized data might make this more possible in the future.
That said (!!!), the cool thing is that we already *have* begun to solve this problem with hyperfunction aggregates and 2-step aggregation, something I showed in the blog post. So, if your dataset can benefit from one of the hyperfunction aggregates that we currently provide, there are lots of cool things you can do with it, including rollups into bigger buckets without creating a second continuous aggregate! If you haven't checked them out, please do! [2][3]
[1]: https://github.com/timescale/timescaledb/issues/1400