WeightWatcher
delve
| | WeightWatcher | delve |
|---|---|---|
| Mentions | 4 | 1 |
| Stars | 1,393 | 77 |
| Growth | 0.4% | - |
| Activity | 9.1 | 4.0 |
| Last commit | 20 days ago | about 1 year ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
WeightWatcher
- Ask HN: Have you seen anything original produced by generative AI?
These models are pretty much always extrapolating [0]
Whether the extrapolation is crude/low-rank or astute/high-rank is a question of memorization vs generalization. That gets into the question of whether or not the model is over-fitted or under-fitted. There are certain heuristics borrowed from high dimensional statistical physics that can be used to guess how good the test performance of a model will be on a typical task without even knowing what the test data is [1].
Originality for me means finding better answers to sub-tasks, and then combining those answers together in a better way. This is the nirvana of cross-entropy minimization: the emergence of capability results from gaining the ability to amass a wider range of skills, improving upon them, and percolating those improvements to multiply the leverage of other skills.
How long such a thing can keep improving with current tech, who knows, but you should really think critically about whether that sounds just like interpolation through the corpus.
[0] Learning in High Dimension Always Amounts to Extrapolation - https://arxiv.org/abs/2110.09485
[1] https://github.com/CalculatedContent/WeightWatcher
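As a rough illustration of the kind of data-free heuristic mentioned above: WeightWatcher reports, among other metrics, a power-law exponent fitted to the eigenvalue spectrum of each layer's weight matrix. The sketch below is not WeightWatcher's implementation. It is a minimal standard-library example of the textbook maximum-likelihood estimate of a tail exponent, run on synthetic "eigenvalues" drawn from a known power law. The function names `tail_exponent` and `synthetic_spectrum`, the fixed `x_min`, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def tail_exponent(values, x_min):
    """MLE of alpha for a density p(x) ~ x^(-alpha) on x >= x_min.

    Hypothetical helper for illustration; not part of any library.
    """
    tail = [v for v in values if v >= x_min]
    if not tail:
        raise ValueError("no values at or above x_min")
    log_sum = sum(math.log(v / x_min) for v in tail)
    if log_sum == 0.0:
        raise ValueError("all tail values equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def synthetic_spectrum(alpha, n, seed=0):
    """Draw n stand-in 'eigenvalues' with density ~ x^(-alpha) on x >= 1."""
    rng = random.Random(seed)
    # random.paretovariate(a) has density ~ x^-(a + 1), so pass a = alpha - 1.
    return [rng.paretovariate(alpha - 1.0) for _ in range(n)]


if __name__ == "__main__":
    eigenvalues = synthetic_spectrum(alpha=3.0, n=20000)
    # The estimate should land close to the true exponent of 3.0.
    print(round(tail_exponent(eigenvalues, x_min=1.0), 2))
```

With a real model, the input would be the eigenvalues of a layer's weight correlation matrix, and `x_min` would have to be chosen from the data rather than fixed in advance. The sketch leaves both of those steps out.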
- Physics and Machine Learning
One of the things I love about physics is that, in addition to probably being my favorite field of study in its own right, it seems that a lot of the conceptual/mathematical content carries over and contributes to other fields. One example I've come across recently can be found here: https://github.com/CalculatedContent/WeightWatcher and here:
- [D] DL Practitioners, Do You Use Layer Visualization Tools s.a GradCam in Your Process?
- A New Link to an Old Model Could Crack the Mystery of Deep Learning
delve
What are some alternatives?
captum - Model interpretability and understanding for PyTorch
cleverhans - An adversarial example library for constructing attacks, building defenses, and benchmarking both
TorchDrift - Drift Detection for your PyTorch Models
only_train_once - OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM
pytea - PyTea: PyTorch Tensor shape error analyzer
cockpit - Cockpit: A Practical Debugging Tool for Training Deep Neural Networks
explainerdashboard - Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
Transformer-MM-Explainability - [ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
backpack - BackPACK - a backpropagation package built on top of PyTorch which efficiently computes quantities other than the gradient.