qbeast-spark vs Local-Data-LakeHouse

qbeast-spark

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary! (by Qbeast-io)

qbeast.io

Local-Data-LakeHouse

Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing. (by dominikhei)

apache-iceberg data-lake data-lakehouse hive-metastore lakehouse Minio trino

Project	qbeast-spark	Local-Data-LakeHouse
Mentions	12	1
Stars	192	44
Growth	4.7%	-
Activity	8.6	4.4
Latest Commit	4 days ago	8 months ago
Language	Scala	Dockerfile
License	Apache License 2.0	-

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
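
The exact formulas behind these numbers are not given here, so the following is only a minimal sketch of how such metrics could be computed. The exponential decay, the 30-day half-life, the linear reading of the 0-10 activity scale, and all function names are assumptions made for illustration, not the site's actual method.

```python
from datetime import date


def star_growth_percent(stars_last_month: int, stars_now: int) -> float:
    """Month-over-month growth in stars, as a percentage."""
    if stars_last_month <= 0:
        raise ValueError("need a positive baseline to compute growth")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def recency_weighted_score(commit_dates, today: date, half_life_days: float = 30.0) -> float:
    """Sum of per-commit weights, so recent commits count more than older ones.

    Assumption: a commit's weight halves every `half_life_days`.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def top_share_percent(activity: float) -> float:
    """Assumed linear reading of the 0-10 activity scale: 9.0 -> top 10%."""
    if not 0.0 <= activity <= 10.0:
        raise ValueError("activity is expected to be between 0 and 10")
    return (10.0 - activity) * 10.0


if __name__ == "__main__":
    today = date(2024, 1, 31)
    recent = [date(2024, 1, 27)]  # a commit 4 days old
    old = [date(2023, 6, 1)]      # a commit roughly 8 months old
    print(recency_weighted_score(recent, today) > recency_weighted_score(old, today))
    print(top_share_percent(9.0))
```

Under these assumptions, a single commit made a few days ago outweighs a single commit made months ago, which is the behaviour the Activity note describes.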

qbeast-spark

Posts with mentions or reviews of qbeast-spark. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-03-09.

Release 0.3.2 of qbeast-spark!
1 project | /r/apachespark | 14 Mar 2023
1 project | /r/dataengineering | 14 Mar 2023

Qbeast-Spark Visualizer!
1 project | /r/apachespark | 16 Feb 2023

Release 0.3.1 of Qbeast Spark
1 project | /r/apachespark | 23 Dec 2022
1 project | /r/dataengineering | 23 Dec 2022

Collaborative roadmap for qbeast-spark: Open Source Table Format
1 project | /r/apachespark | 7 Jun 2022
We want to develop qbeast-spark in an open way, so we publish a tentative Roadmap for this summer https://github.com/Qbeast-io/qbeast-spark/discussions/108

qbeast-spark v0.2.0 available on Maven Central Repository
1 project | /r/dataengineering | 4 Apr 2022

Datasource enabling multidimensional indexing and sampling pushdown
3 projects | dev.to | 9 Mar 2022
If you want to play with it, check out the Qbeast-Spark github

Apache Spark Datasource enabling multidimensional indexing and sampling pushdown
1 project | /r/coolgithubprojects | 2 Mar 2022

New DataSource enabling multi-columnar indexing and efficient data sampling
1 project | /r/opensource | 1 Mar 2022

Local-Data-LakeHouse

Posts with mentions or reviews of Local-Data-LakeHouse. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-04.

Project showcase: sample Data Lakehouse
2 projects | /r/dataengineering | 4 Apr 2023

Here is the Github repo: https://github.com/dominikhei/Local-Data-LakeHouse

What are some alternatives?

When comparing qbeast-spark and Local-Data-LakeHouse, you can also consider the following projects:

Apache Spark - A unified analytics engine for large-scale data processing

matano - Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS

delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

incubator-xtable - Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

Spark Utils - Basic framework utilities to quickly start writing production ready Apache Spark applications

minio-dokku - Dockerfile to run Minio (S3 compatible storage) on Dokku (mini-Heroku)

mmlspark - Simple and Distributed Machine Learning [Moved to: https://github.com/microsoft/SynapseML]

cuelake - Use SQL to build ELT pipelines on a data lakehouse.

Clustering4Ever - C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.

hive-metastore - Apache Hive Metastore as a Standalone server in Docker

Sparkplug - Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌

Rudderstack - Privacy and Security focused Segment-alternative, in Golang and React
