maxim
maxvit
maxim | maxvit | |
---|---|---|
1 | 1 | |
943 | 421 | |
1.6% | 1.9% | |
0.0 | 0.0 | |
11 months ago | 11 months ago | |
Python | Jupyter Notebook | |
Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
maxim
-
GOOGLE new computer vision multi-axis approach improves high level tasks, such as object detection, as well as motion deblurring, denoising, deraining
Today we present a new multi-axis approach that is simple and effective, improves on the original ViT and MLP models, can better adapt to high-resolution, dense prediction tasks, and can naturally adapt to different input sizes with high flexibility and low complexity. Based on this approach, we have built two backbone models for high-level and low-level vision tasks. We describe the first in “MaxViT: Multi-Axis Vision Transformer”, to be presented in ECCV 2022, and show it significantly improves the state of the art for high-level tasks, such as image classification, object detection, segmentation, quality assessment, and generation. The second, presented in “MAXIM: Multi-Axis MLP for Image Processing” at CVPR 2022, is based on a UNet-like architecture and achieves competitive performance on low-level imaging tasks including denoising, deblurring, dehazing, deraining, and low-light enhancement. To facilitate further research on efficient Transformer and MLP models, we have open-sourced the code and models for both MaxViT and MAXIM.
maxvit
-
GOOGLE new computer vision multi-axis approach improves high level tasks, such as object detection, as well as motion deblurring, denoising, deraining
Today we present a new multi-axis approach that is simple and effective, improves on the original ViT and MLP models, can better adapt to high-resolution, dense prediction tasks, and can naturally adapt to different input sizes with high flexibility and low complexity. Based on this approach, we have built two backbone models for high-level and low-level vision tasks. We describe the first in “MaxViT: Multi-Axis Vision Transformer”, to be presented in ECCV 2022, and show it significantly improves the state of the art for high-level tasks, such as image classification, object detection, segmentation, quality assessment, and generation. The second, presented in “MAXIM: Multi-Axis MLP for Image Processing” at CVPR 2022, is based on a UNet-like architecture and achieves competitive performance on low-level imaging tasks including denoising, deblurring, dehazing, deraining, and low-light enhancement. To facilitate further research on efficient Transformer and MLP models, we have open-sourced the code and models for both MaxViT and MAXIM.
What are some alternatives?
maxim-pytorch - [CVPR 2022 Oral] PyTorch re-implementation for "MAXIM: Multi-Axis MLP for Image Processing", with *training code*. Official Jax repo: https://github.com/google-research/maxim
vision_transformer_tf - This repository contains the TensorFlow implementation of the paper "AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE" known as vision transformers.
GIMP-ML - AI for GNU Image Manipulation Program
Azure-Computer-Vision-in-a-day-workshop - Azure Computer Vision 4 (March 2023 - Florence) workshop in a day
swin2sr - Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration. Advances in Image Manipulation (AIM) workshop ECCV 2022. Try it out! over 3.3M runs https://replicate.com/mv-lab/swin2sr
vision-transformer-from-scratch - A Simplified PyTorch Implementation of Vision Transformer (ViT)
friendship-globe
astrophotography_stack_align - Align sequence of star field / astro images taken with a stationary camera (stationary relative to all those stars light years away).
Restormer - [CVPR 2022--Oral] Restormer: Efficient Transformer for High-Resolution Image Restoration. SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.
optc-box-exporter - Export your One Piece Treasure Cruise Box with just using Screenshots
Windows-terminal-context-menu - 📃 This is a simple script to add right click context menu for your best Windows terminal ❤
liga-pytorch - Let Data Dance with PyTorch Models