[D] How to be more productive while doing Deep Learning experiments?

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  1. aim

    Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

    Log everything, literally everything, including hyperparameters, command-line arguments, environment variables, outputs, checkpoints, resource usage, etc. Decent high-level ML frameworks provide this out of the box. Configure a callback on your trainer to send a notification through Slack. To track and compare your experiments, use tools other than just plain TensorBoard. Aim is a fantastic tool to get insights from hundreds of experiments.
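
    As a rough illustration of the "log everything" advice, here is a minimal standard-library sketch that writes hyperparameters, command-line arguments and selected environment variables to a JSON file per run. The function name `snapshot_run`, the `run.json` file name and the environment-variable prefixes are assumptions made for this example; they are not part of Aim or any other tracker mentioned here.

```python
import json
import os
import platform
import sys
import tempfile
import time
from pathlib import Path


def snapshot_run(hparams, run_dir, env_prefixes=("CUDA_", "PYTHON")):
    """Write run metadata to <run_dir>/run.json and return the file path.

    `snapshot_run` is a hypothetical helper for illustration only.
    """
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    prefixes = tuple(env_prefixes)
    record = {
        "started_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "argv": list(sys.argv),          # command-line arguments
        "hparams": dict(hparams),        # hyperparameters
        # Keep only environment variables whose names start with a chosen prefix.
        "env": {k: v for k, v in os.environ.items() if k.startswith(prefixes)},
        "python": platform.python_version(),
        "hostname": platform.node(),
    }
    path = run_dir / "run.json"
    path.write_text(json.dumps(record, indent=2, sort_keys=True))
    return path


# Demo: write the snapshot into a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    out = snapshot_run({"lr": 3e-4, "batch_size": 32}, Path(tmp) / "demo")
    print(json.loads(out.read_text())["hparams"])
```

    A dedicated tracker does considerably more (metrics, checkpoints, comparison across runs); the sketch only shows the kind of metadata worth capturing at the start of every run.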

  3. coddx-alpha

    Todo Kanban Board manages tasks and saves them as TODO.md, a simple plain-text file.

    Yes, for deciding the order of experiments, I also like a Kanban board, like the other commenter suggested. There is a VSCode plugin that displays the content of a TODO.md as a Kanban board: https://github.com/coddx-hq/coddx-alpha

  4. guildai

    Experiment tracking, ML developer tools

    There are a number of experiment tracking systems out there: MLflow, wandb, Guild AI, etc. (disclaimer: I developed Guild). I would look at adopting one of those. While you can roll your own experiment tracking tool, there's just no point IMO.

  5. detectron2

    Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

    http://karpathy.github.io/2019/04/25/recipe/ I sense that your experiments are not very organised. I would recommend using a configuration approach, where each experiment can be described by a config such as https://github.com/facebookresearch/detectron2/blob/master/detectron2/config/config.py; see https://github.com/facebookresearch/detectron2/tree/master/configs for examples of usage. Most experiments should only require changing parameters in the main config. For experiments that require code changes, use git branches to try them and, if they are successful, implement them as config keys.
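
    The configuration approach described above can be sketched with the standard library alone. This is not Detectron2's config system; `TrainConfig`, its fields and `apply_overrides` are hypothetical names chosen for illustration. The idea is that each experiment is the base config plus a handful of `key=value` overrides, and a code change that proved useful becomes a config key (here `use_new_augmentation`).

```python
from dataclasses import asdict, dataclass, fields, replace


@dataclass(frozen=True)
class TrainConfig:
    # Hypothetical base config; field names are illustrative only.
    model: str = "resnet50"
    lr: float = 0.01
    batch_size: int = 16
    max_iter: int = 90_000
    use_new_augmentation: bool = False  # a code change promoted to a config key


def apply_overrides(cfg, overrides):
    """Return a copy of cfg with 'key=value' strings applied.

    Each value is cast to the type of the field's current value.
    """
    known = {f.name for f in fields(cfg)}
    updates = {}
    for item in overrides:
        key, _, raw = item.partition("=")
        if key not in known:
            raise KeyError(f"unknown config key: {key}")
        current = getattr(cfg, key)
        if isinstance(current, bool):
            updates[key] = raw.lower() in ("1", "true", "yes")
        else:
            updates[key] = type(current)(raw)
    return replace(cfg, **updates)


base = TrainConfig()
experiment = apply_overrides(base, ["lr=0.001", "use_new_augmentation=true"])
print(asdict(experiment))
```

    Because the config is frozen and `apply_overrides` returns a copy, the base config stays untouched and every experiment is fully described by its list of overrides.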

  6. Sacred

    Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.

    For 1, set up an experiment tracking framework. I found Sacred to be helpful: https://github.com/IDSIA/sacred.

  7. metaflow

    Build, Deploy and Manage AI/ML Systems

    For building experiments as a DAG, I suggest Metaflow from Netflix. I like the ability to resume if I make a mistake. Make sure you tag your runs so you can always filter runs that had a flaw in them.

  8. nvidia-gpu-scheduler

    NVIDIA GPU compute task scheduling utility

    Sure. No, a simple bash script is not enough. In my case, we have several machines shared in the department, some with GPUs, some without. What I have is a Python script that takes a list of jobs and schedules each one on the first available machine (according to memory/CPU/GPU availability). Unfortunately, what I have is really entangled with our computing platform (Docker-based with a shared filesystem), and it is not easy to turn it into a standalone project (that's why I said "know your infrastructure"). The most similar thing that I could find online is this project. I believe there are also some HPC tools that could be useful (e.g. Slurm), but that's way too much for what we need.
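
    The scheduling idea in the comment above (send each job to the first machine with enough free memory, CPU and GPU) might look roughly like the following. This is a toy first-fit sketch, not the commenter's script and not nvidia-gpu-scheduler; the `Machine` and `Job` classes, their fields and the machine names are all assumptions made for the example, and a real scheduler would also query live resource usage and launch the jobs.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    free_cpus: int
    free_mem_gb: int
    free_gpus: int


@dataclass
class Job:
    name: str
    cpus: int
    mem_gb: int
    gpus: int = 0


def fits(job, machine):
    """True if the machine has enough free CPU, memory and GPUs for the job."""
    return (
        job.cpus <= machine.free_cpus
        and job.mem_gb <= machine.free_mem_gb
        and job.gpus <= machine.free_gpus
    )


def schedule(jobs, machines):
    """First-fit assignment; returns (assignments, pending job names).

    Note: this updates the free resources of `machines` in place.
    """
    assignments, pending = {}, []
    for job in jobs:
        target = next((m for m in machines if fits(job, m)), None)
        if target is None:
            pending.append(job.name)  # no machine can take it right now
            continue
        target.free_cpus -= job.cpus
        target.free_mem_gb -= job.mem_gb
        target.free_gpus -= job.gpus
        assignments[job.name] = target.name
    return assignments, pending


machines = [Machine("cpu-box", 16, 64, 0), Machine("gpu-box", 8, 32, 2)]
jobs = [
    Job("preprocess", cpus=4, mem_gb=8),
    Job("train-a", cpus=4, mem_gb=16, gpus=1),
    Job("train-b", cpus=4, mem_gb=16, gpus=1),
    Job("train-c", cpus=4, mem_gb=16, gpus=1),
]
assigned, waiting = schedule(jobs, machines)
print(assigned)
print(waiting)
```

    In this demo the fourth job stays pending because the GPU machine is full after the second and third jobs; a real setup would retry pending jobs as resources free up.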

  10. tmux

    tmux source code

    Try to avoid Jupyter notebooks; use them only for very preliminary experiments to save time. For the long run, decent IDEs (VSCode, PyCharm) can easily help you stay away from stupid bugs. PyCharm has stunning Python language support, while the open-source VSCode (Insiders channel) makes it very easy to code, run and debug remotely. Use Mosh or Eternal Terminal to prevent disconnection even if your computer is asleep or disconnected from the internet, and use tmux to run tasks when you're away. You can use your smartphone to always stay connected to the same tmux session and monitor the training.

  11. pytorch-lightning

    Discontinued. Build high-performance AI models with PyTorch Lightning (organized PyTorch). Deploy models with Lightning Apps (organized Python to build end-to-end ML systems). [Moved to: https://github.com/Lightning-AI/lightning] (by PyTorchLightning)

    First of all, use high-level ML frameworks (AllenNLP, PyTorch-Lightning). No need to write boilerplate code and implement standard ML approaches from scratch. Here are some suggestions (though more NLP-focused) that I feel improved my research coding experience a lot.
