Yes, I think so, to the extent that open-source Spark has support for columnar data exchange. I think some or much of that work has been done in the last 2 years (see https://github.com/apache/spark/pull/24795/files), but I don't know to what extent one could completely build out the execution part in Spark 3.0 or 3.1.
So, in my day job at NVIDIA, I work on the RAPIDS Accelerator for Apache Spark, an open-source plugin that provides GPU acceleration for ETL workloads by leveraging the RAPIDS cuDF GPU DataFrame library.
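
For anyone wondering what turning the plugin on roughly looks like, here is a minimal sketch that just assembles the Spark settings and prints them as `spark-submit` flags. The jar path is a placeholder and the helper names (`rapids_spark_conf`, `to_submit_args`) are made up for illustration; the `spark.plugins` and `spark.rapids.sql.enabled` keys are the ones I understand the plugin to use, but check the project docs for the version you run.

```python
def rapids_spark_conf(plugin_jar: str, enable_sql: bool = True) -> dict:
    """Build a Spark conf dict that loads the RAPIDS Accelerator plugin.

    plugin_jar is a placeholder path supplied by the caller, not a real artifact name.
    """
    return {
        "spark.jars": plugin_jar,
        # Plugin entry point class loaded through Spark 3.x's plugin mechanism.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Spark conf values are strings, so render the flag as "true"/"false".
        "spark.rapids.sql.enabled": str(enable_sql).lower(),
    }


def to_submit_args(conf: dict) -> list:
    """Turn a conf dict into a flat list of --conf key=value arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


conf = rapids_spark_conf("/path/to/rapids-4-spark.jar")  # hypothetical path
print(" ".join(to_submit_args(conf)))
```

The same dict could be fed key by key into `SparkSession.builder.config(...)` instead of the command line; nothing here touches a GPU or a cluster, it only shows which knobs are involved.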