Datajob: Build and deploy a serverless data pipeline on AWS with no effort.

datajob

4 107 0.0 Python

Build and deploy a serverless data pipeline on AWS with no effort.

I have been working on Datajob, a library that helps me ship my data pipelines to AWS with as little configuration as code as possible, and I'm curious whether other people can use it. I have a minimal version that lets you package your code and its dependencies into AWS Glue Python/PySpark jobs and orchestrate them with Step Functions, as simply as task1 >> task2 >> task3.
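To illustrate how a `task1 >> task2 >> task3` style of orchestration can work in plain Python, here is a minimal sketch. The `Task` class and `execution_order` helper are hypothetical and written only for this example; they are not Datajob's actual API, and nothing here talks to AWS.

```python
class Task:
    """Hypothetical stand-in for a pipeline step (not Datajob's real class)."""

    def __init__(self, name):
        self.name = name
        self.downstream = []  # tasks that should run after this one

    def __rshift__(self, other):
        # `a >> b` records that b runs after a, and returns b so that
        # `a >> b >> c` keeps chaining from left to right.
        self.downstream.append(other)
        return other


def execution_order(first):
    """Follow a linear chain starting at `first` and return the task names in order."""
    order = []
    current = first
    while current is not None:
        order.append(current.name)
        current = current.downstream[0] if current.downstream else None
    return order


task1, task2, task3 = Task("task1"), Task("task2"), Task("task3")
task1 >> task2 >> task3
print(execution_order(task1))  # ['task1', 'task2', 'task3']
```

Returning the right-hand operand from `__rshift__` is what makes the chain read naturally: each `>>` hands the next task forward as the new left-hand side.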

Moto

32 7,374 9.9 Python

A library that allows you to easily mock out tests based on AWS infrastructure.

- One way to test the functionality is to use pytest/unittest/... in combination with moto. I wrote a Medium article more than a year ago that gives an example of how you can test Glue PySpark jobs: https://towardsdatascience.com/testing-glue-pyspark-jobs-4b544d62106e

getting-started

16 1,220 0.0 Makefile

This repository is a getting started guide to Singer. (by singer-io)

If I'm not mistaken, Singer (singer.io) is a set of scripts that move data around. Datajob can help you deploy these Singer scripts to AWS Glue and orchestrate them.
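For readers unfamiliar with Singer, a rough sketch of what such a script looks like may help. In the Singer convention, a "tap" writes one JSON message per line to standard output, using SCHEMA, RECORD, and STATE message types. The stream name, fields, and `last_id` bookmark below are made up for illustration, and the sketch uses only the standard library rather than any Singer helper package.

```python
import json
import sys


def tap_messages(rows, stream="users"):
    """Yield Singer-style messages for `rows` (the stream and fields are hypothetical)."""
    # SCHEMA describes the shape of the records that follow.
    yield {
        "type": "SCHEMA",
        "stream": stream,
        "schema": {
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
            }
        },
        "key_properties": ["id"],
    }
    # One RECORD message per row of data.
    for row in rows:
        yield {"type": "RECORD", "stream": stream, "record": row}
    # STATE bookmarks progress so a later run could resume from here.
    if rows:
        yield {"type": "STATE", "value": {"last_id": rows[-1]["id"]}}


sample_rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
for message in tap_messages(sample_rows):
    sys.stdout.write(json.dumps(message) + "\n")
```

A script like this is the kind of unit that could be packaged as a job and chained with others in a pipeline.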

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
