Udacity-Data-Engineering-Projects
prefect-deployment-patterns
Metric | Udacity-Data-Engineering-Projects | prefect-deployment-patterns
---|---|---
Mentions | 5 | 1
Stars | 1,295 | 93
Growth | - | -
Activity | 0.0 | 0.0
Last commit | over 1 year ago | over 1 year ago
Language | Python | Python
License | GNU General Public License v3.0 or later | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Udacity-Data-Engineering-Projects
- A question about data engineering?
-
✨ 5 Free Resources to Learn Data Engineering 🚀
🔗 https://github.com/san089/Udacity-Data-Engineering-Projects
-
How can I become a big data engineer?
You can start by googling "data engineering learning path" to get a sense of what you need to know. If you are looking for simple projects to start with, you can also look at this repository (https://github.com/san089/Udacity-Data-Engineering-Projects).
-
Beginner DE projects.
For practice, see Data Modeling with Postgres and Udacity Data Engineering Projects as examples, and Data Engineering Project for Beginners - Batch edition for a guided tutorial.
- Data Pipeline Examples in Action
prefect-deployment-patterns
-
[D] Should I go with Prefect, Argo or Flyte for Model Training and ML workflow orchestration?
Have you used infrastructure blocks in Prefect? You could easily build a block for SageMaker that deploys GPU infrastructure for one flow run, then run another flow in a local process, and yet another as a Kubernetes job, Docker container, ECS task, AWS Batch job, etc. It is super easy to set up, even from the UI or from CI/CD. There are a bunch of templates and examples here: https://github.com/anna-geller/prefect-deployment-patterns
What are some alternatives?
hydra - Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes.
Taipy - Turns Data and AI algorithms into production-ready web applications in no time.
data-engineering-zoomcamp - Free Data Engineering course!
buildflow - BuildFlow is an open source framework for building large scale systems using Python. All you need to do is describe where your input is coming from and where your output should be written, and BuildFlow handles the rest. No configuration outside of the code is required.
data-engineering-book - Accumulated knowledge and experience in the field of Data Engineering
weather_data_pipeline - This is a PySpark-based data pipeline that fetches weather data for a few cities, performs some basic processing and transformation on the data, and then writes the processed data to a Google Cloud Storage bucket and a BigQuery table. The data is then viewed in a Looker dashboard.
ask-astro - An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer
canarypy - CanaryPy - A light and powerful canary release for Data Pipelines
pg-counter-metrics - PG Counter Metrics (PGCM) is a tool for publishing PostgreSQL performance data to CloudWatch. Once the data is in CloudWatch, dashboards and alarms can be built on it.
f1-data-pipeline - F1 Data Pipeline
dataall - A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.