MLflow users, what would you want from an integration with GitLab?

This page summarizes the projects mentioned and recommended in the original post on /r/mlops.

  • I took a look at a few different options. The main issue is that GitLab is written in Ruby, while most options (like nbdime) are in Python. It also needs to work by default, with zero effort for the user. What I did was generate Markdown from each version, cleaning up some metadata and noisy outputs, and diff the results (https://gitlab.com/gitlab-org/incubation-engineering/mlops/rb-ipynbdiff). It's an MVP, but it works well enough and allows for diffing outputs as well (I will be adding some metadata back soon too). The next step is to create a semantic diff algorithm over the JSON tree and actually render the diffs per cell.

  • databooks

    A CLI tool that reduces friction between data scientists by removing notebook metadata, which reduces git conflicts, and by gracefully resolving git conflicts.

  • If you're working on diffs for Jupyter Notebooks, it's worth looking into this: https://github.com/datarootsio/databooks

  • orchest

    Build data pipelines, the easy way 🛠️

  • ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

  • I recommend checking out ploomber, great integration with notebooks and Git!
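
The notebook-diffing approach described in the first comment above (render each notebook version as Markdown, drop metadata and noisy outputs, then diff the text) can be sketched roughly as follows. This is not the Ruby code from rb-ipynbdiff; it is a minimal Python illustration that assumes nbformat-4 notebooks already parsed into dicts, and the names `notebook_to_markdown` and `diff_notebooks` are hypothetical.

```python
import difflib

FENCE = "`" * 3  # Markdown code fence, built indirectly to keep this snippet readable


def _text(value):
    """nbformat stores text either as one string or as a list of line strings."""
    return "".join(value) if isinstance(value, list) else (value or "")


def notebook_to_markdown(nb, keep_outputs=True):
    """Flatten a parsed nbformat-4 notebook (a dict) into a list of Markdown lines.

    Cell metadata, execution counts, and non-text outputs (e.g. images) are dropped.
    """
    lines = []
    for cell in nb.get("cells", []):
        source = _text(cell.get("source", "")).splitlines()
        if cell.get("cell_type") != "code":
            # Markdown and raw cells pass through unchanged.
            lines.extend(source + [""])
            continue
        lines.extend([FENCE + "python"] + source + [FENCE, ""])
        if not keep_outputs:
            continue
        for out in cell.get("outputs", []):
            if out.get("output_type") == "stream":
                text = _text(out.get("text", ""))
            else:
                # Keep only the plain-text representation of rich outputs.
                text = _text(out.get("data", {}).get("text/plain", ""))
            if text:
                lines.extend([FENCE] + text.splitlines() + [FENCE, ""])
    return lines


def diff_notebooks(old_nb, new_nb):
    """Return a unified diff (list of lines) between two parsed notebooks."""
    return list(difflib.unified_diff(
        notebook_to_markdown(old_nb),
        notebook_to_markdown(new_nb),
        fromfile="old.ipynb",
        tofile="new.ipynb",
        lineterm="",
    ))
```

To try it, load two versions of a notebook with `json.load` and print the lines returned by `diff_notebooks`. Because execution counts and cell metadata never reach the Markdown, re-running a notebook without changing it produces an empty diff.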



Related posts

  • [D] What MLOps platform do you use, and how helpful are they?

    3 projects | /r/MachineLearning | 24 Mar 2022
  • How do I number my .py file names?

    2 projects | /r/learnpython | 7 Feb 2022
  • Launch HN: Ploomber (YC W22) – Quickly Deploy Data Pipelines from Jupyter/VSCode

    4 projects | news.ycombinator.com | 3 Feb 2022
  • Show HN: JupySQL – a SQL client for Jupyter (ipython-SQL successor)

    2 projects | news.ycombinator.com | 6 Dec 2023
  • Decent low code options for orchestration and building data flows?

    1 project | /r/dataengineering | 23 Dec 2022