| | prefect-deployment-patterns | dataall |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 93 | 210 |
| Growth | - | 2.4% |
| Activity | 0.0 | 9.4 |
| Latest commit | over 1 year ago | 3 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
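The Growth column presumably reflects a simple month-over-month change in star count; a minimal arithmetic sketch of that, with illustrative numbers rather than actual data for either project:

```python
def star_growth_percent(stars_last_month: int, stars_this_month: int) -> float:
    """Month-over-month star growth as a percentage."""
    if stars_last_month == 0:
        return 0.0
    return (stars_this_month - stars_last_month) / stars_last_month * 100


# Illustrative only: going from 205 to 210 stars works out to roughly 2.4% growth.
print(round(star_growth_percent(205, 210), 1))  # 2.4
```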
prefect-deployment-patterns
[D] Should I go with Prefect, Argo or Flyte for Model Training and ML workflow orchestration?
Have you used infrastructure blocks in Prefect? You could easily build a SageMaker infrastructure block that deploys the flow to run on GPUs, then run another flow in a local process, yet another as a Kubernetes job, Docker container, ECS task, AWS Batch job, etc. Super easy to set up, even from the UI or from CI/CD. There are a bunch of templates and examples here: https://github.com/anna-geller/prefect-deployment-patterns
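As a rough illustration of that idea, here is a minimal sketch assuming Prefect 2.x: the same flow registered with two different infrastructure blocks, one running as a local subprocess and one as a Kubernetes job. The flow name, deployment names, and container image are placeholders, not something taken from the linked repo.

```python
from prefect import flow
from prefect.deployments import Deployment
from prefect.infrastructure import KubernetesJob, Process


@flow(log_prints=True)
def train_model():
    print("training...")


if __name__ == "__main__":
    # Deployment 1: run the flow in a local subprocess on the agent's machine.
    # apply=True registers the deployment with the Prefect API you are logged into.
    Deployment.build_from_flow(
        flow=train_model,
        name="local-process",
        infrastructure=Process(),
        apply=True,
    )

    # Deployment 2: run the same flow as a Kubernetes Job (placeholder image).
    Deployment.build_from_flow(
        flow=train_model,
        name="k8s-gpu",
        infrastructure=KubernetesJob(image="my-registry/train:latest"),
        apply=True,
    )
```

Swapping in a DockerContainer block, or an ECSTask block from the prefect-aws collection, follows the same pattern; only the infrastructure argument changes.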
dataall
Newsletter martinmueller.dev 2022 week 19
And again a nice weekly summary. My highlight is aws-dataall, as it looks super interesting for sharing ML data internally within a company and even externally, mhh. Lots to think about here. But as always there is tons of other gold, so you have to explore it!
What are some alternatives?
Taipy - Turns Data and AI algorithms into production-ready web applications in no time.
projen - Rapidly build modern applications with advanced configuration management
Udacity-Data-Engineering-Projects - A few projects related to Data Engineering, including data modeling, infrastructure setup on the cloud, data warehousing, and data lake development.
flowrunner - Flowrunner is a lightweight package to organize and represent Data Engineering/Science workflows
buildflow - BuildFlow is an open-source framework for building large-scale systems using Python. All you need to do is describe where your input comes from and where your output should be written, and BuildFlow handles the rest. No configuration outside of the code is required.
covid-19-data-engineering-pipeline - A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
weather_data_pipeline - A PySpark-based data pipeline that fetches weather data for a few cities, performs some basic processing and transformation on the data, and then writes the processed data to a Google Cloud Storage bucket and a BigQuery table. The data is then viewed in a Looker dashboard.
senjuns - Monorepo for the wiki, landing page, AWS CDK code, and more for Senjuns. Senjuns is (will be) a freelancer platform for connecting seniors and juniors with clients.
canarypy - CanaryPy - A light and powerful canary release tool for data pipelines
MyVoteAWS - A beginner AWS project to learn how various components work by building a voting app.
f1-data-pipeline - F1 Data Pipeline
blueprint-examples - This is where you can find officially supported Cloudify blueprints that work with the latest versions of Cloudify. Please make sure to use the blueprints from this repo when you are evaluating Cloudify.