Metric | DIF | maxvit
---|---|---
Mentions | 2 | 1
Stars | 1 | 421
Growth | - | 1.2%
Activity | 3.6 | 0.0
Last commit | over 2 years ago | 11 months ago
Language | Jupyter Notebook | Jupyter Notebook
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
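The exact formulas behind these metrics are not given here, so the following is only a minimal sketch of how such numbers could be derived. It assumes growth is a simple percentage change in stars, that commit recency is weighted with an exponential decay (the 30-day half-life is a made-up parameter), and that the activity score is ten times the fraction of tracked projects with a lower weighted total. All function names and sample values are hypothetical.

```python
from typing import Optional, Sequence


def month_over_month_growth(stars_now: int, stars_last_month: int) -> Optional[float]:
    """Percentage change in stars versus the previous month (assumed definition)."""
    if stars_last_month == 0:
        return None  # growth is undefined without a baseline, shown as "-" in the table
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def recency_weighted_commits(commit_ages_days: Sequence[float],
                             half_life_days: float = 30.0) -> float:
    """Sum of commit weights, where a commit's weight halves every half_life_days.

    The exponential decay and the half-life are assumptions for illustration only.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw: float, all_raw: Sequence[float]) -> float:
    """Map a raw weighted total to a 0-10 scale by percentile rank (assumed).

    A score of 9.0 then means 90% of tracked projects have a lower raw total,
    i.e. the project is in the top 10%.
    """
    below = sum(1 for value in all_raw if value < raw)
    return round(10.0 * below / len(all_raw), 1)


if __name__ == "__main__":
    # Hypothetical numbers, not taken from either project.
    print(month_over_month_growth(1012, 1000))
    print(recency_weighted_commits([0, 30, 400]))
    print(activity_score(9.0, [float(v) for v in range(10)]))
```

Under this reading, a project with no recent commits drifts toward 0.0 because every old commit contributes almost nothing to the weighted total.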
maxvit
Google's new multi-axis computer vision approach improves high-level tasks such as object detection, as well as motion deblurring, denoising, and deraining
Today we present a new multi-axis approach that is simple and effective, improves on the original ViT and MLP models, can better adapt to high-resolution, dense prediction tasks, and can naturally adapt to different input sizes with high flexibility and low complexity. Based on this approach, we have built two backbone models for high-level and low-level vision tasks. We describe the first in “MaxViT: Multi-Axis Vision Transformer”, to be presented in ECCV 2022, and show it significantly improves the state of the art for high-level tasks, such as image classification, object detection, segmentation, quality assessment, and generation. The second, presented in “MAXIM: Multi-Axis MLP for Image Processing” at CVPR 2022, is based on a UNet-like architecture and achieves competitive performance on low-level imaging tasks including denoising, deblurring, dehazing, deraining, and low-light enhancement. To facilitate further research on efficient Transformer and MLP models, we have open-sourced the code and models for both MaxViT and MAXIM.
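The excerpt above does not spell out what "multi-axis" means. In the MaxViT paper it refers to attending along two axes of a partitioned feature map: local attention inside non-overlapping windows (block attention) and sparse global attention across a strided grid (grid attention). The sketch below illustrates only the token grouping, in plain Python with no attention computation; the function names, the 8x8 map, and the size 4 are hypothetical choices for illustration and are not taken from the official code.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def block_partition(feature_map: Sequence[Sequence[T]], window: int) -> List[List[T]]:
    """Group tokens into non-overlapping window x window blocks (local axis)."""
    height, width = len(feature_map), len(feature_map[0])
    assert height % window == 0 and width % window == 0
    blocks = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            blocks.append([feature_map[i][j]
                           for i in range(top, top + window)
                           for j in range(left, left + window)])
    return blocks


def grid_partition(feature_map: Sequence[Sequence[T]], grid: int) -> List[List[T]]:
    """Group tokens into grid x grid sets sampled with a fixed stride (global axis).

    Each group spans the whole feature map, so attention within a group mixes
    information globally at the same cost as one local window.
    """
    height, width = len(feature_map), len(feature_map[0])
    assert height % grid == 0 and width % grid == 0
    stride_h, stride_w = height // grid, width // grid
    groups = []
    for off_i in range(stride_h):
        for off_j in range(stride_w):
            groups.append([feature_map[i][j]
                           for i in range(off_i, height, stride_h)
                           for j in range(off_j, width, stride_w)])
    return groups


if __name__ == "__main__":
    # Label each token with its (row, col) position so the grouping is visible.
    tokens = [[(i, j) for j in range(8)] for i in range(8)]
    print(block_partition(tokens, 4)[0])  # one contiguous 4x4 window
    print(grid_partition(tokens, 4)[0])   # 16 tokens spread across the whole map
```

Because both partitions always produce groups of a fixed size, the per-group attention cost stays constant as the input grows, which is one way to read the "different input sizes ... low complexity" claim in the excerpt.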
What are some alternatives?
RagTag - Tools for fast and flexible genome assembly scaffolding and improvement
maxim - [CVPR 2022 Oral] Official repository for "MAXIM: Multi-Axis MLP for Image Processing". SOTA for denoising, deblurring, deraining, dehazing, and enhancement.
vision_transformer_tf - A TensorFlow implementation of the Vision Transformer (ViT) from the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale".
Azure-Computer-Vision-in-a-day-workshop - Azure Computer Vision 4 (March 2023 - Florence) workshop in a day
vision-transformer-from-scratch - A Simplified PyTorch Implementation of Vision Transformer (ViT)
astrophotography_stack_align - Align sequence of star field / astro images taken with a stationary camera (stationary relative to all those stars light years away).
optc-box-exporter - Export your One Piece Treasure Cruise box using just screenshots
liga-pytorch - Let Data Dance with PyTorch Models