hydra
Metric | hydra (facebookresearch/hydra) | hydra (hydradatabase/hydra)
---|---|---
Mentions | 14 | 26
Stars | 8,229 | 2,634
Growth | 1.6% | 4.1%
Activity | 6.3 | 8.5
Last commit | 22 days ago | 4 days ago
Language | Python | C
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
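The page does not say how the activity number is computed. As a rough sketch of the idea only (recent commits weigh more, and the result is ranked against all tracked projects), the following assumes an exponential decay with a hypothetical 30-day half-life and a simple percentile rank scaled to 0-10. The function names and the half-life are illustrative assumptions, not the site's actual formula.

```python
from bisect import bisect_left

HALF_LIFE_DAYS = 30.0  # assumption: the source does not state any decay rate


def weighted_commits(commit_ages_days):
    """Sum commits, discounting each by its age so recent ones count more."""
    return sum(0.5 ** (age / HALF_LIFE_DAYS) for age in commit_ages_days)


def activity_score(project_raw, all_raw):
    """Scale a raw score to 0-10 by its percentile among tracked projects."""
    ranked = sorted(all_raw)
    below = bisect_left(ranked, project_raw)  # projects strictly below this one
    return round(10.0 * below / len(ranked), 1)
```

Under these assumptions, a project whose raw score beats 9 of 10 tracked projects gets 9.0, which matches the "top 10%" reading given above.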
hydra
- Hydra – a Framework for configuring complex applications
-
Show HN: Hydra - Open-Source Columnar Postgres
Nice tool, but the name is unfortunate; consider changing it. There is already a very well-known security tool named hydra, https://github.com/vanhauser-thc/thc-hydra, which has been around since 2001. Then Facebook went ahead and named their config tool hydra, https://github.com/facebookresearch/hydra, on top of it. We get it, the hydra is popular mythology, but we could use more original naming for tools.
-
Show HN: Hydra 1.0 – open-source column-oriented Postgres
This looks really impressive, and I'm excited to see how it performs on our data!
P.S., I think the name conflicts with Hydra, the configuration management library: https://hydra.cc/
-
Best practice for saving logits/activation values of model in PyTorch Lightning
I've been trying to learn PyTorch Lightning and Hydra in order to use/create my own custom deep learning template (e.g. like this) as it would greatly help with my research workflow. A lot of the work I do requires me to analyse metrics based on the logits/activations of the model.
-
[D] Alternatives to fb Hydra?
However, Hydra seems to have several limitations that are really annoying and are making me reconsider my choice. Most problematic is the inability to group parameters together in a multirun. Hydra only supports trying all combinations of parameters, as described in https://github.com/facebookresearch/hydra/issues/1258, and fixing this does not seem to be a priority for Hydra. Furthermore, Hydra's Optuna optimizer implementation does not allow for early pruning of bad runs, which, while not a deal breaker, is definitely a nice-to-have feature.
-
Show HN: Lightweight YAML Config CLI for Deep Learning Projects
Do you hate the fact that they don't let you return the config file: https://github.com/facebookresearch/hydra/issues/407
-
Config management for deep learning
I kind of built this due to frustrations with Hydra. Hydra is an end-to-end framework: it locks you into a certain DL project format and decides logging, model saving and a whole host of other things. For example, Hydra can do the same config file overwriting that I allow, but you have to store the config file under the name config.yaml inside a specific folder. On top of that, Hydra doesn’t let you return the config file from the main function, so you have to put all the major logic in the main function itself (link); the authors claim this is by design. I can see Hydra being useful for a mature, less experimental project. But in my robotics and ML research, I like being able to write code where I want and integrate it how I want, especially when debugging, which is where I think this package is useful. TL;DR: If you just want the config file functionality, use my package; if you want a complete DL project manager, use Hydra. While Hydra implements this config file functionality, it also adds a lot of restrictions to project structure that you might not like.
-
The YAML Document from Hell
For managing configs of ML experiments (where each experiment can override a base config, and "variant" configs can further override the experiment config, etc.), Hydra + YAML + OmegaConf is really nice.
https://hydra.cc/
I admit I don't fully understand all the advanced options in Hydra, but the basic usage is already very useful. A nice guide is here:
https://florianwilhelm.info/2022/01/configuration_via_yaml_a...
- Hydra - installation and basic usage
- Hydra - installation and basic usage
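The multirun complaint quoted in "[D] Alternatives to fb Hydra?" above (all combinations only, no grouping of parameters) comes down to the difference between a Cartesian product and a paired sweep. The sketch below shows that difference in plain Python; it does not call Hydra, and the parameter names `lr` and `batch_size` and the helper functions are hypothetical.

```python
from itertools import product

# Hypothetical sweep values; the names are illustrative only.
sweep = {"lr": [0.1, 0.01], "batch_size": [32, 64]}


def all_combinations(params):
    """Cartesian product: every value of each parameter with every other."""
    keys = list(params)
    return [dict(zip(keys, values)) for values in product(*params.values())]


def grouped(params):
    """Paired sweep: the i-th values of each parameter travel together."""
    keys = list(params)
    return [dict(zip(keys, values)) for values in zip(*params.values())]
```

With two values per parameter, `all_combinations(sweep)` yields four runs while `grouped(sweep)` yields two, which is the behavior the commenter was asking for.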
hydra
-
Using ClickHouse to scale an events engine
Don't feel bad, lots of people get bitten by not reading all the way down to the bottom of their readme: https://github.com/hydradatabase/hydra/blob/v1.1.2/README.md... While Hydra may very well license their own code Apache 2, they ship the AGPLv3 columnar which to my very best IANAL understanding taints the whole stack and AGPLv3's everything all the way through https://github.com/hydradatabase/hydra/blob/v1.1.2/columnar/...
-
Moving a Billion Postgres Rows on a $100 Budget
Columnar store PostgreSQL extensions exist; here are two, though I think I’m missing at least one more:
https://github.com/citusdata/cstore_fdw
https://github.com/hydradatabase/hydra
You can also connect other stores using foreign data wrappers, such as Parquet files stored on an object store, DuckDB, ClickHouse… though the joins aren’t optimised, as PostgreSQL would do a full scan on the external table when joining.
- Hydra (YC W22) adds upsert to columnar Postgres
- Hydra
-
Is ClickHouse Moving Away from Open Source?
New column store alternative: https://github.com/hydradatabase/hydra
HN: https://news.ycombinator.com/item?id=37571974
-
Show HN: Hydra - Open-Source Columnar Postgres
some previous discussions:
https://news.ycombinator.com/item?id=37247945
https://news.ycombinator.com/item?id=36987920
and a relevant observation is that there are actually multiple license files in the repo so the consumer should read their explicit licensing section of the readme <https://github.com/hydradatabase/hydra#license> since the GitHub sidebar is misleading
-
CDC from postgres to postgres.
Hydra DB (link to GitHub) -> Worked well for aggregate query use cases but not for queries that build reports. Also, data insertion and updates are abysmal on columnar DBs.
-
How Query Engines Work
There's a lot of experience about db operation and how to approach MVCC encoded in PostgreSQL that shouldn't be underestimated.
[0]: https://github.com/hydradatabase/hydra
-
Hydra: Column-Oriented Postgres
And just like last time, watch out for the misleading GitHub license detector because it's not entirely Apache as the GitHub summary claims but rather *some* is Apache and buried in the interior is some AGPL stuff: https://github.com/hydradatabase/hydra#license
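Several comments above note that a columnar store does well on aggregates but poorly on single-row inserts and report-style queries that need whole records. The toy model below illustrates why, using plain Python lists. It is not how Hydra or any real engine is implemented (real columnar stores add compression and batching that this ignores), and all names in it are made up for illustration.

```python
# Toy data: three events in a row-oriented layout (one dict per record).
rows = [
    {"id": 1, "user": "a", "amount": 10},
    {"id": 2, "user": "b", "amount": 20},
    {"id": 3, "user": "a", "amount": 5},
]

# The same data in a column-oriented layout (one list per column).
columns = {key: [row[key] for row in rows] for key in rows[0]}


def total_amount_columnar(cols):
    """An aggregate reads a single column and ignores the rest."""
    return sum(cols["amount"])


def fetch_record_columnar(cols, index):
    """Rebuilding one full record has to touch every column."""
    return {key: values[index] for key, values in cols.items()}


def insert_columnar(cols, record):
    """A single-row insert appends to every column separately."""
    for key, values in cols.items():
        values.append(record[key])
```

The aggregate touches one list, while fetching or inserting a single record touches all of them, so the cost of row-at-a-time work grows with the number of columns.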
What are some alternatives?
dynaconf - Configuration Management for Python ⚙
duckdb - DuckDB is an in-process SQL OLAP Database Management System
ConfigParser
citus - Distributed PostgreSQL as an extension
python-dotenv - Reads key-value pairs from a .env file and can set them as environment variables. It helps in developing applications following the 12-factor principles.
ClickHouse - ClickHouse® is a free analytics DBMS for big data
python-decouple - Strict separation of config from code.
postgres - PostgreSQL in Neon
django-environ - Django-environ allows you to utilize 12factor inspired environment variables to configure your Django application.
Udacity-Data-Engineering-Projects - Few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
classyconf - Declarative and extensible library for configuration & code separation
vasco - vasco: MIC & MINE statistics for Postgres