Sacred
nvidia-gpu-scheduler
Metric | Sacred | nvidia-gpu-scheduler
---|---|---
Mentions | 6 | 1
Stars | 4,158 | 7
Growth | 0.2% | -
Activity | 3.5 | 0.0
Last commit | 3 months ago | over 1 year ago
Language | Python | Python
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Sacred
-
Sacred VS cascade - a user suggested alternative
2 projects | 5 Dec 2023
-
✨ 7 Best Machine Learning Experiment Logging Tools in 2022 🚀
🔗 https://github.com/IDSIA/sacred
-
[D] Facebook Visdom vs Google Tensorboard for Pytorch
I'm using Omniboard (https://github.com/vivekratnavel/omniboard) with Sacred (https://github.com/IDSIA/sacred) for tracking experiments. You can specify custom Observers in Sacred so the model metrics and logs will be saved to a local directory or to a remote DB (e.g., MongoDB). I use a MongoDB database hosted on Atlas. Unlike other suggested options, Sacred and Omniboard are free. Atlas free tier comes with 512MB of free storage which is a huge amount if you're uploading only log files to it.

    ex = Experiment()
    ex.observers.append(FileStorageObserver(EXPERIMENTS_ROOT))
    ex.observers.append(MongoObserver(url=MONGODB_URL, db_name='sacred'))
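To illustrate the observer idea described in that comment, here is a minimal standard-library sketch. It is not Sacred's implementation or API: `TinyExperiment`, `FileObserver`, `MemoryObserver`, and the `completed` hook are hypothetical names made up for this example, and `MemoryObserver` merely stands in for a remote database such as MongoDB.

```python
import json
import tempfile
from pathlib import Path


class FileObserver:
    """Hypothetical observer: writes each run to <basedir>/<run_id>/run.json."""

    def __init__(self, basedir):
        self.basedir = Path(basedir)

    def completed(self, run_id, config, metrics):
        run_dir = self.basedir / str(run_id)
        run_dir.mkdir(parents=True, exist_ok=True)
        record = {"config": config, "metrics": metrics}
        (run_dir / "run.json").write_text(json.dumps(record, indent=2))


class MemoryObserver:
    """Hypothetical stand-in for a remote DB: keeps run records in a list."""

    def __init__(self):
        self.records = []

    def completed(self, run_id, config, metrics):
        self.records.append({"_id": run_id, "config": config, "metrics": metrics})


class TinyExperiment:
    """Hypothetical experiment runner that notifies every attached observer."""

    def __init__(self):
        self.observers = []
        self._next_id = 1

    def run(self, func, config):
        run_id = self._next_id
        self._next_id += 1
        metrics = func(**config)  # the training function returns its metrics
        for observer in self.observers:
            observer.completed(run_id, config, metrics)
        return run_id


def train(lr, epochs):
    # Placeholder "training" that only derives a number from the config.
    return {"final_loss": round(1.0 / (1.0 + lr * epochs), 4)}


root = tempfile.mkdtemp()
ex = TinyExperiment()
memory = MemoryObserver()
ex.observers.append(FileObserver(root))
ex.observers.append(memory)
first_run = ex.run(train, {"lr": 0.1, "epochs": 10})
```

The experiment code stays unaware of where results end up. Adding another storage backend means appending one more observer, which is the property the comment relies on when it logs to both a local directory and a database.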
-
Can someone tell me good libraries you use on a day to day basis that increases your research productivity in ML/AI?
Sacred helped me log my experiments. I set up my environment only once, 4 years ago, and since then I have had a list of all my training runs with their hyperparameters and results.
-
[D] How to be more productive while doing Deep Learning experiments?
For 1, set up an experiment tracking framework. I found Sacred to be helpful: https://github.com/IDSIA/sacred.
nvidia-gpu-scheduler
-
[D] How to be more productive while doing Deep Learning experiments?
Sure. No, a simple bash script is not enough. In my case, we have several machines shared in the department, some with GPUs, some without. What I have is a Python script that takes a list of jobs and schedules each one on the first available machine (according to memory/CPU/GPU availability). Unfortunately, what I have is really entangled with our computing platform (Docker-based with a shared filesystem) and not easy to turn into a standalone project (that's why I said "know your infrastructure"). The most similar thing that I could find online is this project. I believe there are also some HPC tools that could be useful (e.g. Slurm), but that's way too much for what we need.
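The first-available-machine idea from that comment can be sketched roughly as follows. This is not the commenter's script and not nvidia-gpu-scheduler's code: the `Machine` and `Job` classes, their field names, and the example machine and job names are assumptions made for illustration. A real scheduler would also have to query live resource usage and actually launch the jobs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Machine:
    name: str
    free_cpus: int
    free_mem_gb: float
    free_gpus: int


@dataclass
class Job:
    name: str
    cpus: int
    mem_gb: float
    gpus: int = 0


def fits(job: Job, machine: Machine) -> bool:
    """True if the machine has enough free CPU, memory, and GPUs for the job."""
    return (
        job.cpus <= machine.free_cpus
        and job.mem_gb <= machine.free_mem_gb
        and job.gpus <= machine.free_gpus
    )


def schedule(jobs: List[Job], machines: List[Machine]) -> Dict[str, Optional[str]]:
    """Place each job on the first machine that fits it (first-fit).

    Returns a mapping from job name to machine name, or None when no
    machine currently has enough free resources for that job.
    """
    placement: Dict[str, Optional[str]] = {}
    for job in jobs:
        placement[job.name] = None
        for machine in machines:
            if fits(job, machine):
                # Reserve the resources so later jobs see the reduced capacity.
                machine.free_cpus -= job.cpus
                machine.free_mem_gb -= job.mem_gb
                machine.free_gpus -= job.gpus
                placement[job.name] = machine.name
                break
    return placement


machines = [
    Machine("cpu-box", free_cpus=16, free_mem_gb=64, free_gpus=0),
    Machine("gpu-box", free_cpus=8, free_mem_gb=32, free_gpus=2),
]
jobs = [
    Job("preprocess", cpus=4, mem_gb=8),
    Job("train-a", cpus=4, mem_gb=16, gpus=1),
    Job("train-b", cpus=4, mem_gb=16, gpus=2),
]
placement = schedule(jobs, machines)
```

In this example `train-b` is left unplaced because only one GPU remains free once `train-a` is assigned. A fuller version would presumably queue such jobs and retry when resources are released.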
What are some alternatives?
MLflow - Open source platform for the machine learning lifecycle
detectron2 - Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
pytorch-lightning - Build high-performance AI models with PyTorch Lightning (organized PyTorch). Deploy models with Lightning Apps (organized Python to build end-to-end ML systems). [Moved to: https://github.com/Lightning-AI/lightning]
fastapi-cloud-tasks - GCP's Cloud Tasks + Cloud Scheduler + FastAPI = Partial replacement for celery.
tensorflow - An Open Source Machine Learning Framework for Everyone
stable-diffusion-nvidia-docker - GPU-ready Dockerfile to run Stability.AI stable-diffusion model v2 with a simple web interface. Includes multi-GPUs support.
Keras - Deep Learning for humans
Clairvoyant - Software designed to identify and monitor social/historical cues for short term stock movement
tmux - tmux source code
scikit-learn - scikit-learn: machine learning in Python
aim - Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.