TorchDrift
WeightWatcher
| | TorchDrift | WeightWatcher |
|---|---|---|
| | 1 | 4 |
| Stars | 302 | 1,392 |
| Growth | 0.0% | 1.5% |
| Activity | 0.0 | 9.2 |
| | over 1 year ago | 17 days ago |
| Language | Jupyter Notebook | Python |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
TorchDrift
WeightWatcher
- Ask HN: Have you seen anything original produced by generative AI?
These models are pretty much always extrapolating [0]
Whether the extrapolation is crude/low-rank or astute/high-rank is a question of memorization vs generalization. That gets into the question of whether the model is over-fitted or under-fitted. There are certain heuristics borrowed from high-dimensional statistical physics that can be used to guess how good the test performance of a model will be on a typical task without even knowing what the test data is [1].
Originality for me means finding better answers to sub-tasks, and then combining those answers together in a better way. This is the nirvana of cross-entropy minimization - the emergence of capability results from gaining the ability to amass a wider range of skills, improving upon them, and percolating those improvements to multiply the leverage of other skills.
How long such a thing can keep improving with current tech, who knows, but you should really think critically about whether that sounds just like interpolation through the corpus.
[0] Learning in High Dimension Always Amounts to Extrapolation - https://arxiv.org/abs/2110.09485
[1] https://github.com/CalculatedContent/WeightWatcher
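To give a rough feel for the kind of heuristic the comment above refers to: one common idea is to fit a power-law exponent to the tail of a layer's eigenvalue spectrum and read that exponent as a quality signal, without touching any test data. The sketch below is only an illustration of that idea, not WeightWatcher's implementation or API. The function name `power_law_alpha`, the fixed `x_min`, and the synthetic "eigenvalues" are all assumptions made for the example; it uses the standard continuous maximum-likelihood estimator for a power-law exponent.

```python
import math
import random


def power_law_alpha(values, x_min):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Assumes the tail density behaves like p(x) ~ x**(-alpha) for x >= x_min.
    Hypothetical helper for illustration; not part of any library.
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no values at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all tail values equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


# Synthetic stand-in for a layer's eigenvalues (made-up data, not a real model).
# paretovariate(a) has density exponent a + 1, so this sample has alpha = 3.
rng = random.Random(0)
true_alpha = 3.0
eigenvalues = [rng.paretovariate(true_alpha - 1.0) for _ in range(20000)]

alpha_hat = power_law_alpha(eigenvalues, x_min=1.0)
print(f"estimated alpha: {alpha_hat:.2f}")
```

With a real network, the input would be the eigenvalues of each layer's weight correlation matrix rather than random draws, and choosing `x_min` well is itself part of the fitting problem; both are left out here to keep the sketch short.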
- Physics and Machine Learning
One of the things I love about physics is that, in addition to probably being my favorite field of study in its own right, it seems that a lot of the conceptual/mathematical content carries over and contributes to other fields. One example I've come across recently can be found here: https://github.com/CalculatedContent/WeightWatcher and here:
- [D] DL Practitioners, Do You Use Layer Visualization Tools s.a GradCam in Your Process?
- A New Link to an Old Model Could Crack the Mystery of Deep Learning
What are some alternatives?
Transformer-MM-Explainability - [ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
captum - Model interpretability and understanding for PyTorch
cockpit - Cockpit: A Practical Debugging Tool for Training Deep Neural Networks
pytea - PyTea: PyTorch Tensor shape error analyzer
uncertainty-toolbox - Uncertainty Toolbox: a Python toolbox for predictive uncertainty quantification, calibration, metrics, and visualization
loss-landscape - Code for visualizing the loss landscape of neural nets
explainerdashboard - Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
cleverhans - An adversarial example library for constructing attacks, building defenses, and benchmarking both