Any MLOps platform you use?

This page summarizes the projects mentioned and recommended in the original post on /r/selfhosted.

  • neptune-client

    📒 Experiment tracking tool and model registry

    Neptune.ai promises to streamline your workflows and make collaboration a breeze.

  • manifests

    A repository for Kustomize manifests

    That said, I personally use Kubeflow hosted on a local bare-metal Kubernetes cluster (8 nodes, 4 GPUs), but a lot of it is a bit of a bear to install correctly in a multi-machine environment (specifically, this issue is still open, and exposing the built-in dashboards outside of the cluster is a problem). Also, because it's a Google product, it's very clearly intended to run in the cloud, with self-hosting very much an afterthought.

  • polyaxon

    MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle

    If you're not concerned about self-hosting, WandB is one of the more fully featured training monitoring tools (I've used it in the past without any issues, but the lack of data and training privacy and the lack of self-hosting options make it a hard no for anything that isn't scholastic). Polyaxon is an alternative, but rewriting all your variable logging to conform to its requirements makes it very difficult to switch to in the middle of a project, so you have to commit to it from the get-go.

  • MLflow

    Open source platform for the machine learning lifecycle

    I have an old labmate who uses a similar setup with MLflow and can endorse it.

  • aim

    Aim 💫 — An easy-to-use & supercharged open-source AI metadata tracker (experiment tracking, AI agents tracing)

    Check out Aim: https://github.com/aimhubio/aim

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
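
The Polyaxon comment above points out that logging calls written for one tracker are hard to migrate mid-project. One possible way to soften that is to route all metric logging through a thin interface of your own, so that only one adapter changes when you switch tools. The following is a minimal sketch under that assumption; the names `MetricLogger`, `InMemoryLogger`, and `train` are hypothetical and do not come from Polyaxon, WandB, MLflow, or any other project listed here.

```python
from typing import Dict, List, Protocol, Tuple


class MetricLogger(Protocol):
    """Hypothetical interface; not part of any tracking library."""

    def log(self, step: int, metrics: Dict[str, float]) -> None: ...


class InMemoryLogger:
    """Stand-in backend that simply stores whatever it is given."""

    def __init__(self) -> None:
        self.records: List[Tuple[int, Dict[str, float]]] = []

    def log(self, step: int, metrics: Dict[str, float]) -> None:
        # Copy the dict so later mutation by the caller does not alter history.
        self.records.append((step, dict(metrics)))


def train(logger: MetricLogger, steps: int = 3) -> None:
    """Toy loop: the training code only ever talks to the interface."""
    for step in range(steps):
        loss = 1.0 / (step + 1)  # placeholder value, not a real loss
        logger.log(step, {"loss": loss})


logger = InMemoryLogger()
train(logger)
print(logger.records)
```

An adapter for a specific tracker would implement the same `log` method and forward the values to that tracker's client, leaving the training loop untouched. Whether this is worth the extra layer depends on how likely a switch really is.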
