Metric | Azure-Computer-Vision-in-a-day-workshop | maxvit
---|---|---
Mentions | 1 | 1
Stars | 42 | 421
Growth | - | 1.9%
Activity | 7.2 | 0.0
Last commit | about 1 year ago | 11 months ago
Language | Jupyter Notebook | Jupyter Notebook
License | - | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
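The page does not give the exact formulas behind these numbers, so the following is only a minimal sketch of how such metrics could be computed. The exponential half-life weighting, the percentile mapping, and all function names are assumptions made for illustration, not the tracker's actual method; the previous-month star count of 413 is likewise a hypothetical value chosen to reproduce a 1.9% growth figure.

```python
def month_over_month_growth(stars_last_month, stars_now):
    """Percentage change in star count over one month."""
    if stars_last_month <= 0:
        raise ValueError("stars_last_month must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Sum of per-commit weights; a commit's weight halves every half_life_days.

    Assumption: exponential decay is just one way to give recent commits
    a higher weight than older ones.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(project_weight, all_weights):
    """Map a project's weight to a 0-10 score from its percentile rank.

    A score of 9.0 means 90% of tracked projects have a lower weight,
    i.e. the project is in the top 10%.
    """
    below = sum(1 for w in all_weights if w < project_weight)
    return round(10.0 * below / len(all_weights), 1)


if __name__ == "__main__":
    # Hypothetical numbers: 413 stars last month, 421 stars now.
    print(round(month_over_month_growth(413, 421), 1))
    # Commits made today, 30 days ago, and 60 days ago.
    print(recency_weighted_commits([0, 30, 60]))
    # Ten tracked projects with weights 0..9; the project with weight 9 ranks highest.
    print(activity_score(9, list(range(10))))
```

Under these assumptions, a project with no recent commits decays toward a weight of zero, which is consistent with a stale repository showing an activity of 0.0.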
Azure-Computer-Vision-in-a-day-workshop
Azure Cognitive for Vision - Large Foundation Model
To illustrate this concept, I'm sharing GitHub code that demonstrates the functionality in action: Azure-ComputerVision-ImageRetrieval. I also highly recommend this repository by Serge Retkowsky, where you'll find excellent examples and demonstrations to kick-start your learning journey with Azure Cognitive Services for Vision.
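For readers unfamiliar with image retrieval, the core ranking step can be sketched without any cloud service. The sketch below assumes that each image and the query have already been converted to embedding vectors by some embedding model; the `index` dictionary, the file names, and the two-dimensional vectors are hypothetical placeholders, and none of this code is taken from the linked repositories.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vector, index, k=3):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query_vector, vec))
              for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical embeddings; real ones would have hundreds of dimensions.
    index = {
        "cat.jpg": [1.0, 0.0],
        "dog.jpg": [0.0, 1.0],
        "kitten.jpg": [0.9, 0.1],
    }
    for name, score in top_k([1.0, 0.0], index, k=2):
        print(name, round(score, 3))
```

In a real setup the vectors would come from the vision service and the linear scan would typically be replaced by a vector index, but the ranking idea stays the same.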
maxvit
Google's new multi-axis computer vision approach improves high-level tasks, such as object detection, as well as motion deblurring, denoising, and deraining
Today we present a new multi-axis approach that is simple and effective, improves on the original ViT and MLP models, can better adapt to high-resolution, dense prediction tasks, and can naturally adapt to different input sizes with high flexibility and low complexity. Based on this approach, we have built two backbone models for high-level and low-level vision tasks. We describe the first in “MaxViT: Multi-Axis Vision Transformer”, to be presented at ECCV 2022, and show it significantly improves the state of the art for high-level tasks, such as image classification, object detection, segmentation, quality assessment, and generation. The second, presented in “MAXIM: Multi-Axis MLP for Image Processing” at CVPR 2022, is based on a UNet-like architecture and achieves competitive performance on low-level imaging tasks including denoising, deblurring, dehazing, deraining, and low-light enhancement. To facilitate further research on efficient Transformer and MLP models, we have open-sourced the code and models for both MaxViT and MAXIM.
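The excerpt does not spell out what "multi-axis" means. In the MaxViT paper it refers to splitting a feature map two ways: into local windows (block attention) and into sparse, uniformly strided groups that span the whole map (grid attention). The sketch below only shows which positions end up attending to each other under each partition; it is a simplified illustration with hypothetical function names, not the official implementation, and it assumes the map size is divisible by the window and grid sizes.

```python
def block_partition(height, width, window):
    """Group (row, col) positions into non-overlapping window x window blocks.

    Tokens in the same block are neighbours, so attention is local.
    """
    groups = {}
    for i in range(height):
        for j in range(width):
            groups.setdefault((i // window, j // window), []).append((i, j))
    return list(groups.values())


def grid_partition(height, width, grid):
    """Group positions into sets of grid x grid tokens spread over the whole map.

    Tokens in the same group are height // grid rows and width // grid
    columns apart, so attention is sparse but global.
    """
    stride_h, stride_w = height // grid, width // grid
    groups = {}
    for i in range(height):
        for j in range(width):
            groups.setdefault((i % stride_h, j % stride_w), []).append((i, j))
    return list(groups.values())


if __name__ == "__main__":
    # A 4 x 4 feature map with 2 x 2 windows and a 2 x 2 grid.
    print(block_partition(4, 4, window=2)[0])
    print(grid_partition(4, 4, grid=2)[0])
```

On the 4 x 4 example, the first block contains the four top-left neighbours, while the first grid group contains positions (0, 0), (0, 2), (2, 0), and (2, 2). Both partitions keep the number of tokens per attention group fixed, which is what keeps the cost manageable as the input resolution grows.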
What are some alternatives?
ocrpy - OCR, Archive, Index and Search: Implementation agnostic OCR framework.
maxim - [CVPR 2022 Oral] Official repository for "MAXIM: Multi-Axis MLP for Image Processing". SOTA for denoising, deblurring, deraining, dehazing, and enhancement.
Azure-ComputerVision-ImageAnalysis - Azure-ComputerVision- Caption and Dense captions (version 4.0 preview)
vision_transformer_tf - This repository contains the TensorFlow implementation of the paper "AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE" known as vision transformers.
Azure-ComputerVision-ImageRetrieval - Azure-ComputerVision-ImageRetrieval (version 4.0 preview)
vision-transformer-from-scratch - A Simplified PyTorch Implementation of Vision Transformer (ViT)
computervision-recipes - Best Practices, code samples, and documentation for Computer Vision.
astrophotography_stack_align - Align sequence of star field / astro images taken with a stationary camera (stationary relative to all those stars light years away).
mmagic - OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
optc-box-exporter - Export your One Piece Treasure Cruise Box using just screenshots
liga-pytorch - Let Data Dance with PyTorch Models