- pytorch-grad-cam: Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
- Transformer-Explainability: [CVPR 2021] Official PyTorch implementation of "Transformer Interpretability Beyond Attention Visualization", a novel method to visualize classifications by Transformer-based networks.
Not exactly the same, but since you mentioned using the ViT's attention outputs as a 2D feature map for the CAM, you could consider the paper above (Transformer Interpretability Beyond Attention Visualization). It studies exactly the question of how to choose and mix the attention scores across layers so they can be visualized, similar to CAMs, and it may lead to better results. https://arxiv.org/abs/2012.09838 https://github.com/hila-chefer/Transformer-Explainability
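To make the "mixing attention scores" idea concrete, here is a minimal sketch of the simpler attention-rollout baseline (Abnar & Zuidema, 2020) that the paper above builds on and improves: per-layer attention matrices are averaged over heads, combined with an identity matrix to account for the residual connection, and multiplied through the layers to get a per-token relevance map. This is not the paper's exact method; the `attentions` list, the 14x14 patch grid, and a CLS token at index 0 are assumptions about your ViT.

```python
import torch

def attention_rollout(attentions, grid_size=14):
    """Mix per-layer attention maps into one relevance map (attention rollout).

    attentions: list of tensors of shape (batch, heads, tokens, tokens),
    one per transformer layer; token 0 is assumed to be the CLS token.
    """
    batch, _, tokens, _ = attentions[0].shape
    device = attentions[0].device
    # Start from the identity: each token initially attends only to itself.
    rollout = torch.eye(tokens, device=device).expand(batch, tokens, tokens).clone()
    for attn in attentions:
        # Average over heads, add the identity for the residual (skip)
        # connection, and renormalize the rows so they sum to 1.
        attn = attn.mean(dim=1)
        attn = attn + torch.eye(tokens, device=device)
        attn = attn / attn.sum(dim=-1, keepdim=True)
        # Multiply through the layers to track how attention flows back
        # to the input tokens.
        rollout = attn @ rollout
    # Relevance of each image patch for the CLS token, as a 2D map.
    cls_to_patches = rollout[:, 0, 1:]
    return cls_to_patches.reshape(batch, grid_size, grid_size)
```

The resulting grid can then be upsampled to the input resolution and overlaid on the image, just like a CAM heatmap.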