HugsVision
Transformer-Explainability
| | HugsVision | Transformer-Explainability |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 188 | 1,660 |
| Growth | - | - |
| Activity | 0.0 | 0.0 |
| Last commit | 9 months ago | 3 months ago |
| Language | Jupyter Notebook | Jupyter Notebook |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
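The exact formula behind the activity score is not published here; the description above (recent commits weighted more heavily than older ones) can be illustrated with a small sketch. The exponential decay, the `half_life_days` parameter, and the sample dates below are illustrative assumptions, not the site's actual implementation.

```python
from datetime import datetime, timezone

def activity_score(commit_dates, half_life_days=30):
    """Recency-weighted commit count (illustrative assumption, not the real formula).

    Each commit contributes a weight that halves every `half_life_days`,
    so recent commits count more than older ones."""
    now = datetime.now(timezone.utc)
    score = 0.0
    for d in commit_dates:
        age_days = (now - d).total_seconds() / 86400
        score += 0.5 ** (age_days / half_life_days)
    return score

# A repo whose only recent commit is many months old scores close to 0.0,
# consistent with the "0.0" activity values in the table above.
print(round(activity_score([datetime(2021, 12, 1, tzinfo=timezone.utc)]), 2))
```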
HugsVision
[R] HugsVision: An easy-to-use HuggingFace wrapper for computer vision
Find more tutorials and information about HugsVision on GitHub.
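For a rough sense of the workflow such a wrapper simplifies, here is a minimal image-classification fine-tuning sketch using the underlying HuggingFace `transformers` ViT API directly. It is not HugsVision's own API (see the GitHub repo for that); the `beans` dataset, output paths, and hyperparameters are illustrative choices.

```python
import torch
from datasets import load_dataset
from transformers import (Trainer, TrainingArguments,
                          ViTForImageClassification, ViTImageProcessor)

# "beans" is an illustrative dataset; substitute your own images.
ds = load_dataset("beans")
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
labels = ds["train"].features["labels"].names

def transform(batch):
    # Turn PIL images into the pixel tensors the ViT expects.
    inputs = processor(batch["image"], return_tensors="pt")
    inputs["labels"] = batch["labels"]
    return inputs

ds = ds.with_transform(transform)

def collate(examples):
    return {
        "pixel_values": torch.stack([e["pixel_values"] for e in examples]),
        "labels": torch.tensor([e["labels"] for e in examples]),
    }

model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={l: i for i, l in enumerate(labels)},
)

args = TrainingArguments(
    output_dir="./vit-finetune",     # illustrative path and hyperparameters
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
    remove_unused_columns=False,     # keep the "image" column for the transform
)

Trainer(model=model, args=args, data_collator=collate,
        train_dataset=ds["train"], eval_dataset=ds["validation"]).train()
```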
Transformer-Explainability
[Project] Recent Class Activation Map Methods for CNNs and Vision Transformers
Not exactly the same, but since you mentioned using ViT's attention outputs as a 2D feature map for the CAM, you could consider this paper (Transformer Interpretability Beyond Attention Visualization), which studies how to choose and mix the attention scores so that they can be visualized, similar to CAMs. It may lead to better results. https://arxiv.org/abs/2012.09838 https://github.com/hila-chefer/Transformer-Explainability
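For intuition about mixing attention scores into a CAM-like 2D map, here is a minimal sketch of attention rollout (head-averaged attention multiplied layer by layer). This is a simpler baseline, not the relevance-propagation method from the linked repo; it uses the `transformers` ViT API, and the image path is a placeholder.

```python
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTModel

# Plain ViT backbone; output_attentions=True exposes the per-layer attention maps.
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTModel.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
model.eval()

image = Image.open("example.jpg").convert("RGB")   # placeholder input image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    attentions = model(**inputs).attentions         # tuple of (1, heads, 197, 197)

# Attention rollout: average over heads per layer, add the identity to account
# for residual connections, renormalise, then multiply the layers together.
rollout = None
for att in attentions:
    att = att.mean(dim=1)                           # (1, 197, 197)
    att = att + torch.eye(att.size(-1))
    att = att / att.sum(dim=-1, keepdim=True)
    rollout = att if rollout is None else att @ rollout

# Relevance of each image patch to the [CLS] token, as a 14x14 map for ViT-B/16 at 224px.
heatmap = rollout[0, 0, 1:].reshape(14, 14)
print(heatmap.shape)
```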
What are some alternatives?
poolformer - PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
pytorch-grad-cam - Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Real-time-Object-Detection-for-Autonomous-Driving-using-Deep-Learning - My Computer Vision project from my Computer Vision Course (Fall 2020) at Goethe University Frankfurt, Germany. Performance comparison between state-of-the-art Object Detection algorithms YOLO and Faster R-CNN based on the Berkeley DeepDrive (BDD100K) Dataset.
shap - A game theoretic approach to explain the output of any machine learning model.
fashionpedia-api - Python API for Fashionpedia Dataset
T2T-ViT - ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
ganspace - Discovering Interpretable GAN Controls [NeurIPS 2020]
multi-label-sentiment-classifier - How to build a multi-label sentiment classifiers with Tez and PyTorch
CoordConv
tf-metal-experiments - TensorFlow Metal Backend on Apple Silicon Experiments (just for fun)
Vision-Project-Image-Segmentation
deep-text-recognition-benchmark - PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)