Python vision-transformer

Open-source Python projects categorized as vision-transformer

Top 23 Python vision-transformer Projects

  • mmdetection

    OpenMMLab Detection Toolbox and Benchmark

  • LaTeX-OCR

    pix2tex: Using a ViT to convert images of equations into LaTeX code.

  • Project mention: Detexify LaTeX Handwriting Symbol Recognition | news.ycombinator.com | 2023-11-14
  • SwinIR

    SwinIR: Image Restoration Using Swin Transformer (official repository)

  • Efficient-AI-Backbones

    Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

  • mmpretrain

    OpenMMLab Pre-training Toolbox and Benchmark

  • scenic

    Scenic: A Jax Library for Computer Vision Research and Beyond (by google-research)

  • towhee

    Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

  • Project mention: FLaNK Stack Weekly for 14 Aug 2023 | dev.to | 2023-08-14
  • EVA

    EVA Series: Visual Representation Fantasies from BAAI (by baaivision)

  • EasyCV

    An all-in-one toolkit for computer vision

  • Project mention: FLaNK Stack Weekly for 20 June 2023 | dev.to | 2023-06-20

    All in One Computer Vision https://github.com/alibaba/EasyCV

  • VRT

    VRT: A Video Restoration Transformer (official repository)

  • VoxFormer

    Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]

  • InternVideo

    Video Foundation Models & Data for Multimodal Understanding

  • ONE-PEACE

    A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

  • Project mention: A general representation modal across vision, audio, language modalities | news.ycombinator.com | 2023-05-25
  • how-do-vits-work

    (ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"

  • vit-explain

    Explainability for Vision Transformers

  • ImageNet21K

    Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)

  • DAT

    Repository of Vision Transformer with Deformable Attention (CVPR 2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention (by LeapLabTHU)

  • swin2sr

    Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration. Advances in Image Manipulation (AIM) workshop ECCV 2022. Try it out! over 3.3M runs https://replicate.com/mv-lab/swin2sr

  • thepipe

    Feed PDFs, docs, slides, web pages and more into GPT-4-Vision in one line of code ⚡

  • Project mention: Show HN: I just open sourced my document/website extractor for Vision-LLMs | news.ycombinator.com | 2024-04-02
  • parseq

    Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)

  • Project mention: need help for license plate number segmentation | /r/deeplearning | 2023-05-31

    I really recommend using scene text recognition models. They are perfect for this type of use case: https://github.com/baudm/parseq or check https://paperswithcode.com/task/scene-text-recognition. Make sure to check the licenses, and good luck 👍🏻

  • GCVit

    [ICML 2023] Official PyTorch implementation of Global Context Vision Transformers

  • MPViT

    [CVPR 2022] MPViT: Multi-Path Vision Transformer for Dense Prediction

  • CrossViT

    Official implementation of CrossViT. https://arxiv.org/abs/2103.14899

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source vision-transformer projects in Python? This list will help you:

Rank Project Stars
1 mmdetection 27,742
2 LaTeX-OCR 10,770
3 SwinIR 4,060
4 Efficient-AI-Backbones 3,783
5 mmpretrain 3,156
6 scenic 2,995
7 towhee 2,989
8 EVA 1,957
9 EasyCV 1,679
10 VRT 1,244
11 VoxFormer 961
12 InternVideo 909
13 ONE-PEACE 838
14 how-do-vits-work 784
15 vit-explain 708
16 ImageNet21K 695
17 DAT 693
18 swin2sr 526
19 thepipe 506
20 parseq 496
21 GCVit 414
22 MPViT 340
23 CrossViT 299
