I've also been frustrated by tools that keep you locked into one predetermined view, table, or set of tables at a time. I made a semantic data modeling library that builds the queries (and the joins) for you using a drill-across querying technique, and it can also join data across different data sources in a secondary execution layer.
https://github.com/totalhack/zillion
Disclaimer: this project is currently a one-man show, though I use it in production at my own company.
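For anyone unfamiliar with drill-across querying, here is a rough sketch of the general idea: aggregate each fact table separately at the same dimension grain, then merge the result sets on the shared dimension. This is not Zillion's API. The `drill_across` helper, the `sales` and `leads` tables, and the sample rows are all made up for illustration, and it uses only Python's built-in `sqlite3`.

```python
import sqlite3


def drill_across(conn, dimension, fact_queries):
    """Run one aggregate query per fact table, then merge rows on the shared dimension.

    Hypothetical helper for illustration; each query must return
    (dimension_value, metric_value) pairs.
    """
    merged = {}
    for metric, sql in fact_queries.items():
        for dim_value, value in conn.execute(sql):
            merged.setdefault(dim_value, {})[metric] = value
    metrics = list(fact_queries)
    # Metrics missing for a dimension value come back as None.
    return [
        {dimension: dim_value, **{m: row.get(m) for m in metrics}}
        for dim_value, row in sorted(merged.items())
    ]


# Made-up sample data: two fact tables that share a "campaign" dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (campaign TEXT, revenue REAL);
    CREATE TABLE leads (campaign TEXT, lead_id INTEGER);
    INSERT INTO sales VALUES ('email', 100.0), ('email', 50.0), ('search', 75.0);
    INSERT INTO leads VALUES ('email', 1), ('search', 2), ('search', 3), ('social', 4);
""")

report = drill_across(conn, "campaign", {
    "revenue": "SELECT campaign, SUM(revenue) FROM sales GROUP BY campaign",
    "leads": "SELECT campaign, COUNT(*) FROM leads GROUP BY campaign",
})

for row in report:
    print(row)
```

Because each fact table is aggregated on its own before the merge, the row counts of one table cannot inflate the sums of the other, which is the usual risk when two fact tables are joined directly.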
I think you're talking about denormalizing data before importing it into an OLAP system to avoid joins later. However, that greatly limits data modeling flexibility, and denormalization itself can be a headache. I have tested StarRocks (https://github.com/StarRocks/starrocks): it can perform joins while data imports are streaming in, and it is very fast. It's worth a try.
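A toy illustration of the flexibility point, using SQLite as a stand-in (this shows nothing about StarRocks itself, and the `customers`, `orders`, and `orders_flat` tables are invented for the example): a pre-joined flat table freezes dimension attributes at build time, while a join at query time picks up later changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US');
    INSERT INTO orders VALUES (1, 10.0), (2, 20.0), (1, 5.0);
    -- Denormalized copy built ahead of time: region is frozen into each row.
    CREATE TABLE orders_flat AS
        SELECT o.amount, c.region
        FROM orders o JOIN customers c ON c.id = o.customer_id;
""")

# A dimension attribute changes after the flat table was built.
conn.execute("UPDATE customers SET region = 'US' WHERE id = 1")

# The flat table still reports the old region.
flat = dict(conn.execute(
    "SELECT region, SUM(amount) FROM orders_flat GROUP BY region"
))

# Joining at query time reflects the current dimension data.
joined = dict(conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders o JOIN customers c ON c.id = o.customer_id
    GROUP BY c.region
"""))

print("flat:  ", flat)
print("joined:", joined)
```

With the flat table, picking up the region change means rebuilding or backfilling `orders_flat`; with the normalized tables, nothing needs to be reloaded.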