-
Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
-
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
This is still being pursued. Ross Wightman's timm[0,1] package (now on Hugging Face) has done a lot of this. There's also a V2 of ConvNeXt[2]. Ross writes about this a lot on Twitter, fwiw. I should also mention that many transformer-based networks still beat convs, so there probably won't be a resurgence in convs until someone shows a really strong reason for them. Convs have some advantages, but they also might not be flexible enough for the long-range tasks in segmentation and detection. But maybe they are.
FAIR definitely did great work with ConvNeXt, and I do hope to see more. There always need to be people pushing unpopular paradigms.
[0] https://github.com/huggingface/pytorch-image-models
[1] https://arxiv.org/abs/2110.00476
[2] https://arxiv.org/abs/2301.00808