ML Experiments Management with Git

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • dvc

    🦉 ML Experiments and Data Management with Git

  • determined

    Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.

  • Use Determined if you want a nice UI: https://github.com/determined-ai/determined#readme

  • mandala

    A powerful and easy to use Python framework for experiment tracking and incremental computing

  • Another option, which manages versioning of your computational graph and its results and provides extremely elegant queryable memoization, is Mandala: https://github.com/amakelov/mandala

    It is a much simpler and much more magical piece of software that truly expanded how I think about writing, exploring, and experimenting with code. Even if you never use it, you would probably enjoy reading the blog posts the author wrote about the design of the tool: https://amakelov.github.io/blog/pl/

  • scidataflow

    Command line scientific data management tool

  • I've really liked the idea of scidataflow in this context: https://github.com/vsbuffalo/scidataflow

    It's neat for research as it stores the data on scientific data repositories like Zenodo and you get DOIs.
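
To make the "queryable memoization" idea mentioned for Mandala above more concrete, here is a minimal sketch in plain Python. It is not Mandala's API: the names `STORE`, `memoize`, `query`, and `train` are hypothetical and exist only for this illustration. The sketch assumes arguments are JSON-serializable and keeps results in memory, keyed by a hash of the function name and its arguments, so that repeated calls are skipped and past calls can be looked up by parameter value.

```python
import functools
import hashlib
import json

# Hypothetical in-memory store for this illustration (not part of any library).
STORE = {}


def memoize(func):
    """Cache results under a hash of the function name and its arguments."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Assumes all arguments are JSON-serializable.
        payload = json.dumps([func.__name__, args, kwargs], sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()
        if key not in STORE:
            # Only run the function the first time this exact call is seen.
            STORE[key] = {
                "func": func.__name__,
                "args": args,
                "kwargs": kwargs,
                "result": func(*args, **kwargs),
            }
        return STORE[key]["result"]
    return wrapper


def query(func_name, **filters):
    """Return stored records for func_name whose keyword arguments match."""
    return [
        record
        for record in STORE.values()
        if record["func"] == func_name
        and all(record["kwargs"].get(k) == v for k, v in filters.items())
    ]


CALLS = []  # tracks how often the function body actually runs


@memoize
def train(lr, epochs):
    """Stand-in for an expensive experiment; returns a placeholder 'loss'."""
    CALLS.append((lr, epochs))
    return round(1.0 / (1.0 + lr * epochs), 4)
```

Calling `train(lr=0.1, epochs=10)` twice runs the body once, and `query("train", epochs=10)` then returns every stored run with that epoch count. A real tool would also persist results and track code changes, which this sketch leaves out.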

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
