Metric | incubator-gluten | opaque-sql
---|---|---
Mentions | 3 | 2
Stars | 988 | 176
Growth | 3.0% | 0.0%
Activity | 9.9 | 1.1
Last commit | 7 days ago | about 1 year ago
Language | Scala | Scala
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
incubator-gluten
- A glimpse into the future of data processing infrastructure.
  When I first learned about the Gluten project from Intel, I thought Databricks was going to be in trouble.
- FLaNK Stack for 04 December 2023
- Blaze: Fast query execution engine for Apache Spark
  Interesting, looks like it is just DataFusion engine for Spark. There is a similar project: https://github.com/oap-project/gluten - it brings ClickHouse as an engine to Spark.
opaque-sql
- How to Run Spark SQL on Encrypted Data
  Introducing Opaque SQL, an open-source platform for securely running Spark SQL queries on encrypted data. Built by top systems and security researchers at UC Berkeley, the platform uses hardware enclaves to securely execute queries on private data in an untrusted environment.
- Announcing MC²: Securely perform analytics and machine learning on confidential data
  The MC2 Compute Services: MC2 offers several compute services: these include Spark SQL, distributed XGBoost, and secure aggregation for federated learning. All are intended to run in a primarily untrusted environment, such as a cluster of machines hosted on a public cloud, that has support for trusted execution environments (hardware enclaves). Data is encrypted in transit using a client key and only ever decrypted inside hardware enclaves, providing the previously mentioned security guarantees for data-in-use. For all compute services, MC2 leverages the Open Enclave SDK, a project intended to provide a consistent API for a variety of different enclave architectures.
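The "encrypted with a client key, decrypted only inside an enclave" flow described above can be illustrated with a minimal Java sketch. This is not the MC2 or Opaque SQL API and does not use the Open Enclave SDK: the class `EnclaveSketch` and its methods are hypothetical names, AES-GCM is an assumed cipher choice, and the "enclave" is simulated by an ordinary method. A real deployment would only release the key to an enclave after remote attestation, which is not shown here.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Hypothetical illustration only: names and cipher choice are assumptions,
// not taken from MC2, Opaque SQL, or the Open Enclave SDK.
public class EnclaveSketch {
    static final int IV_BYTES = 12;   // standard nonce size for GCM
    static final int TAG_BITS = 128;  // authentication tag length

    // The client generates and keeps the data key.
    static SecretKey newClientKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Client side: encrypt before the data leaves the trusted machine.
    // Output layout: [12-byte IV][ciphertext + 16-byte tag].
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plaintext);
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    // Stand-in for code running inside a hardware enclave: the only place
    // where the key and the plaintext are meant to exist on the server.
    static byte[] decryptInsideEnclave(SecretKey key, byte[] message) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
        return cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newClientKey();
        byte[] row = "patient_id=42,diagnosis=flu".getBytes(StandardCharsets.UTF_8);
        byte[] wire = encrypt(key, row);  // what the untrusted host sees
        byte[] seen = decryptInsideEnclave(key, wire);
        System.out.println(new String(seen, StandardCharsets.UTF_8));
    }
}
```

The untrusted host only ever handles the `wire` bytes. Because GCM is authenticated, tampering with them makes `decryptInsideEnclave` throw instead of returning altered data.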
What are some alternatives?
LearningSparkV2 - This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
kyuubi - Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
blaze - Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core.
mc2 - A Platform for Secure Analytics and Machine Learning
blaze - NumPy and Pandas interface to Big Data
secure-xgboost - Secure collaborative training and inference for XGBoost.
Jupyter Scala - A Scala kernel for Jupyter
cerebro - Cerebro: A platform for Secure Coopetitive Learning
delphi - A Cryptographic Inference Service for Neural Networks
narrator - David Attenborough narrates your life
secure-aggregation - Secure aggregation for federated learning using enclaves