datajob
Build and deploy a serverless data pipeline on AWS with no effort. (by vincentclaes)
DataEngineeringProject
Example end to end data engineering project. (by damklis)
Metric | datajob | DataEngineeringProject
---|---|---
Mentions | 4 | 5
Stars | 107 | 985
Growth | - | -
Activity | 0.0 | 0.0
Last commit | about 1 year ago | over 1 year ago
Language | Python | Python
License | Apache License 2.0 | MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
datajob
Posts with mentions or reviews of datajob.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2021-04-04.
- Build and deploy a serverless data pipeline on AWS with no effort.
- Datajob: Build and deploy a serverless data pipeline on AWS with no effort.
  Thanks! Triggering a pipeline run based on a schedule is one of the ideas to implement next: https://github.com/vincentclaes/datajob#ideas
- Aws Glue Environmentsdlc
  I created an open source library called `datajob` to deploy and orchestrate Glue jobs. You can find it on GitHub at https://github.com/vincentclaes/datajob and on PyPI.
DataEngineeringProject
Posts with mentions or reviews of DataEngineeringProject.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-11-18.
- What are your favourite GitHub repos that shows how data engineering should be done?
- Is it me or are beginner-friendly ETL pipeline guides that explain from the ground-up how to incorporate the use of various technologies notoriously difficult to find.
- Starting A Data Engineering Project Series
  News RSS Feeds
- 5 Data Sources for Data Engineering Projects
  Lastly, the most readily available data source would be data scraped from the internet. To be slightly less vague, I have outlined a project that web-scrapes new online articles every ten minutes to provide all the latest news curated into one place. This project utilizes a wide variety of relevant data engineering tools, which makes it a great project example. The author of this project is Damian Kliś, and he outlines his model architecture below:
- Can You Recommend Good Data Engineering Projects
  Here is my project that got me a few interviews so far: https://github.com/damklis/DataEngineeringProject
What are some alternatives?
When comparing datajob and DataEngineeringProject you can also consider the following projects:
tributary - Streaming reactive and dataflow graphs in Python
blinkist-scraper - 📚 Python tool to download book summaries and audio from Blinkist.com, and generate some pretty output