Upload to S3 -> AWS Lambda with some Scala Spark code -> Process -> Write back to S3

This page summarizes the projects mentioned and recommended in the original post on /r/scala

  • s3-sqs-connector

    A library for reading data from Amazon S3 with optimised listing via Amazon SQS, using Spark SQL Streaming (Structured Streaming).
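A rough sketch of what using this connector might look like. The format name and option keys (`s3-sqs`, `sqsUrl`, `fileFormat`, `sqsFetchIntervalSeconds`) are taken from my reading of the project's README and may differ by version; the queue URL, schema fields, and app name are placeholders:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

object S3SqsDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("s3-sqs-demo").getOrCreate()

    // Streaming sources require a schema up front (placeholder fields).
    val schema = new StructType()
      .add("id", LongType)
      .add("payload", StringType)

    // Instead of repeatedly listing the bucket, the source consumes
    // S3 event notifications from an SQS queue, so discovering new
    // files stays cheap even with millions of objects in the bucket.
    val df = spark.readStream
      .format("s3-sqs")
      .schema(schema)
      .option("sqsUrl", "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue") // placeholder
      .option("fileFormat", "json")
      .option("sqsFetchIntervalSeconds", "2")
      .load()
  }
}
```

For this to work, the S3 bucket must be configured to publish `s3:ObjectCreated:*` event notifications to the SQS queue.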

  • Are you planning to upload and process many files to S3? If so, I would use something like Structured Streaming with the FileSource, which can detect new files uploaded to S3 and process them on a "standard" Spark cluster. You can then build a cluster on EKS/Kubernetes that is easy to deploy and operate. Once the number of files you upload starts to get really large, check out https://github.com/qubole/s3-sqs-connector. Glue could also achieve roughly the same thing, without the hassle of managing the EKS/K8s clusters.
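The FileSource approach above can be sketched as a small Structured Streaming job. This is a minimal example assuming JSON input; bucket names, paths, and schema fields are placeholders, and running it requires a Spark cluster with the `hadoop-aws`/`s3a` connector configured:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

object S3FileStreamJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("s3-file-stream")
      .getOrCreate()

    // Streaming file sources need the schema declared up front.
    val schema = new StructType()
      .add("id", LongType)
      .add("payload", StringType)

    // The file source lists the input path on each trigger and picks
    // up newly uploaded objects; this listing is what gets expensive
    // as the number of files grows (hence s3-sqs-connector later).
    val events = spark.readStream
      .schema(schema)
      .option("maxFilesPerTrigger", 100) // cap new files per micro-batch
      .json("s3a://my-input-bucket/uploads/") // placeholder bucket

    // Write results back to S3; the checkpoint records which input
    // files have already been processed, so restarts are safe.
    events.writeStream
      .format("parquet")
      .option("path", "s3a://my-output-bucket/processed/")            // placeholder
      .option("checkpointLocation", "s3a://my-output-bucket/checkpoints/")
      .start()
      .awaitTermination()
  }
}
```

Deployed on EKS via the Spark Kubernetes scheduler (or the Spark Operator), this gives you the always-on, file-detecting pipeline described above without any Lambda glue code.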

