hydra vs lightning-hydra-template

| | hydra | lightning-hydra-template |
|---|---|---|
| Mentions | 14 | 9 |
| Stars | 8,229 | 3,674 |
| Growth | 1.6% | - |
| Activity | 6.3 | 5.1 |
| Latest commit | 21 days ago | about 2 months ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hydra
- Hydra – a framework for configuring complex applications
-
Show HN: Hydra - Open-Source Columnar Postgres
Nice tool; the name is unfortunate though, consider changing it. There is already a very well-known security tool named Hydra (https://github.com/vanhauser-thc/thc-hydra) that has been around since 2001. Then Facebook went ahead and named their config tool Hydra (https://github.com/facebookresearch/hydra) on top of it. We get it, Hydra is popular mythology, but we could use more original naming for tools.
-
Show HN: Hydra 1.0 – open-source column-oriented Postgres
This looks really impressive, and I'm excited to see how it performs on our data!
P.S., I think the name conflicts with Hydra, the configuration management library: https://hydra.cc/
-
Best practice for saving logits/activation values of model in PyTorch Lightning
I've been trying to learn PyTorch Lightning and Hydra in order to use/create my own custom deep learning template (e.g. like this) as it would greatly help with my research workflow. A lot of the work I do requires me to analyse metrics based on the logits/activations of the model.
-
[D] Alternatives to fb Hydra?
However, Hydra seems to have several limitations that are really annoying and are making me reconsider my choice. Most problematic is the inability to group parameters together in a multirun: Hydra only supports trying all combinations of the swept parameters, as described in https://github.com/facebookresearch/hydra/issues/1258, and fixing this does not seem to be a priority for the Hydra team. Furthermore, Hydra's Optuna sweeper implementation does not allow early pruning of bad runs, which, while not a deal-breaker, would definitely be a nice-to-have feature.
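To make the complaint above concrete: Hydra's basic multirun launches one job per element of the Cartesian product of the swept values, with no built-in way to "zip" lists into paired runs. A stdlib-only sketch of that expansion (the parameter names here are hypothetical, not from any real project):

```python
from itertools import product

# Hypothetical sweep values, as one might pass on a multirun command line,
# e.g. something like: python train.py -m lr=0.01,0.1 batch_size=32,64
lr_values = [0.01, 0.1]
batch_sizes = [32, 64]

# The basic sweeper expands to the full Cartesian product: 4 jobs, not 2
# paired (lr, batch_size) runs.
jobs = list(product(lr_values, batch_sizes))
print(jobs)  # [(0.01, 32), (0.01, 64), (0.1, 32), (0.1, 64)]
```

This is exactly the behavior the linked issue asks to change: there is no way to say "run (0.01, 32) and (0.1, 64) only".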
-
Show HN: Lightweight YAML Config CLI for Deep Learning Projects
Do you hate the fact that they don't let you return the config file: https://github.com/facebookresearch/hydra/issues/407
-
Config management for deep learning
I kind of built this out of frustration with Hydra. Hydra is an end-to-end framework: it locks you into a certain DL project format and decides logging, model saving, and a whole host of other things. For example, Hydra can do the same config-file overriding that I allow, but you have to store the config file under the name config.yaml inside a specific folder. On top of that, Hydra doesn't let you return the config from the main function, so you have to put all the major logic in the main function itself (link); the authors claim this is by design. I can see Hydra being useful for a mature, less experimental project. But in my robotics and ML research I like being able to write code where I want and integrate it how I want, especially when debugging, which is where I think this package is useful. TL;DR: if you just want the config-file functionality, use my package; if you want a complete DL project manager, use Hydra. While Hydra implements this config-file functionality, it also adds a lot of restrictions on project structure that you might not like.
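A minimal stdlib sketch of the pattern being complained about. The decorator below is a toy stand-in for a framework entry point like `@hydra.main`, not Hydra's actual implementation: the framework calls your function itself and discards its return value, so anything you need afterwards has to escape through an enclosing scope.

```python
import functools

def toy_main(func):
    """Toy stand-in for a framework entry-point decorator: it builds the
    config, calls the wrapped function, and discards the return value."""
    @functools.wraps(func)
    def wrapper():
        func({"lr": 0.01})  # framework constructs the config and passes it in
        return None         # ...and throws the function's return value away
    return wrapper

results = {}  # common workaround: smuggle values out via an outer-scope container

@toy_main
def main(cfg):
    results["cfg"] = cfg  # stash what we need instead of returning it
    return cfg            # this return value never reaches the caller

print(main())         # None: the decorator swallowed the return value
print(results["cfg"])  # {'lr': 0.01}: the workaround captured it
```

This is the shape of the workaround people use for the linked issue; it is what pushes "all the major logic" into the decorated function.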
-
The YAML Document from Hell
For managing configs of ML experiments (where each experiment can override a base config, "variant" configs can further override the experiment config, and so on), Hydra + YAML + OmegaConf is really nice.
https://hydra.cc/
I admit I don't fully understand all the advanced options in Hydra, but the basic usage is already very useful. A nice guide is here:
https://florianwilhelm.info/2022/01/configuration_via_yaml_a...
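The layered overriding described above (base → experiment → variant, later layers winning) can be sketched with a plain recursive dict merge. OmegaConf's real merge does far more (interpolation, typed schemas); this is just the core idea, with made-up config keys:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into `base`; override's values win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

base = {"optimizer": {"name": "adam", "lr": 1e-3}, "epochs": 10}
experiment = {"optimizer": {"lr": 3e-4}}  # overrides only the learning rate
variant = {"epochs": 50}                  # further overrides the epoch count

cfg = deep_merge(deep_merge(base, experiment), variant)
print(cfg)  # {'optimizer': {'name': 'adam', 'lr': 0.0003}, 'epochs': 50}
```

Note that the experiment layer only touches `optimizer.lr`; the sibling key `optimizer.name` survives from the base, which is the behavior that makes layered YAML configs pleasant.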
- Hydra - installation and basic usage
lightning-hydra-template
- User-friendly PyTorch Lightning and Hydra template for ML experimentation
-
Best practice for saving logits/activation values of model in PyTorch Lightning
I've been trying to learn PyTorch Lightning and Hydra in order to use/create my own custom deep learning template (e.g. like this) as it would greatly help with my research workflow. A lot of the work I do requires me to analyse metrics based on the logits/activations of the model.
-
[D] Is Pytorch Lightning + Wandb a good combination for research?
I can't say for sure whether it is the best combination for research in the long run, but if you do go down that route I have found this template very useful
-
How do research scientists structure their code?
lightning-hydra-template
-
[D] Any research specific PyTorch based boilerplate code?
This lightning + hydra template is quite complete. Great for learning best practices.
-
Typing and testing for torch
A good example is this project template https://github.com/ashleve/lightning-hydra-template. It uses a lot of cool things such as
-
Our template to kickstart your pytorch projects, with list of best practices. Minimal boilerplate code. Leverages Lightning + Hydra. Focused on scalability, reproducibility and fast experimentation.
and many more! (check out the "Your Superpowers" section of the readme)
-
General and feature-rich PyTorch/Hydra project template for rapid and scalable ML experimentation, with a list of best practices
I write a LightningDataModule. I found it to be an intuitive way to encapsulate any dataset. LightningDataModule is a simple abstraction providing methods for data download, split, and transforms, and exposing dataloaders. I would love to see more researchers try out this concept, even in projects that don't use PyTorch Lightning. Reading a LightningDataModule lets me immediately see how the dataset is prepared, while most data science projects seem to scatter data logic across different parts of the pipeline, making it hard to understand what's going on. You can see an example of such a datamodule here.
-
[P] General and feature-rich PyTorch/Hydra template for rapid and scalable ML research/experimentation, with a list of best practices
I feel like most ML people don't use those tools because they simply don't realize all the advantages (Hydra in particular seems like a very useful addition to any deep learning project). I focused on structuring the readme in a way that (I hope) gives you a quick overview - my hope is that it can help spread the word about these frameworks in the broader community. It incorporates best practices and tricks I gathered over the last couple of months of playing around with them.
What are some alternatives?
dynaconf - Configuration Management for Python
lightning-hydra-template - Deep Learning project template best practices with Pytorch Lightning, Hydra, Tensorboard.
ConfigParser
pytorch_tempest - My repo for training neural nets using pytorch-lightning and hydra
python-dotenv - Reads key-value pairs from a .env file and can set them as environment variables. It helps in developing applications following the 12-factor principles.
neptune-client - The MLOps stack component for experiment tracking
python-decouple - Strict separation of config from code.
lightning-transformers - Flexible components pairing Hugging Face Transformers with PyTorch Lightning
django-environ - Django-environ allows you to utilize 12factor inspired environment variables to configure your Django application.
traingenerator - A web app to generate template code for machine learning
classyconf - Declarative and extensible library for configuration & code separation
neptune-contrib - This library is a location of the LegacyLogger for PyTorch Lightning.