[D] Productionalizing machine learning pipelines for small teams

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • flyte

    Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

  • The most important thing is versioning and reproducibility. https://github.com/flyteorg/flyte is an option for data pipelines, but it is quite complicated. As long as the path from raw data to model input is traceable, any solution is fine.

  • polyaxon

    MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle

  • For running experiments, http://polyaxon.com/ is a really good free, open-source package with lots of nice integrations, so you can quickly run experiments on k8s. It might be overkill in some cases, though.

  • ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

  • I wrote a detailed survey on this. However, I'm biased since I have a project of my own: Ploomber.
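
The traceability point in the flyte comment above (any solution is fine as long as the path from raw data to model input can be traced) can be illustrated without an orchestration tool. The sketch below is only one possible approach, not something taken from the thread: it hashes a raw data file and the model input derived from it, then writes both hashes and a code version to a JSON manifest. The function names `file_sha256` and `record_lineage`, the manifest layout, and the `code_version` argument are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_lineage(raw_path, model_input_path, code_version, manifest_path):
    """Write a JSON manifest linking a raw data file to the model input built from it.

    The manifest layout is an illustrative assumption, not a standard format.
    """
    entry = {
        "raw_data": {"path": str(raw_path), "sha256": file_sha256(raw_path)},
        "model_input": {
            "path": str(model_input_path),
            "sha256": file_sha256(model_input_path),
        },
        # Supplied by the caller, e.g. a git commit hash of the preprocessing code.
        "code_version": code_version,
    }
    Path(manifest_path).write_text(json.dumps(entry, indent=2))
    return entry
```

Calling `record_lineage("raw.csv", "features.csv", "abc123", "lineage.json")` after each preprocessing run (file names here are placeholders) leaves a small record that can be committed next to the model artifacts. Tools such as flyte or Ploomber track this kind of lineage at the pipeline level, so a manifest like this only makes sense for very small setups.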

