Python vision-transformer

Open-source Python projects categorized as vision-transformer

Top 23 Python vision-transformer Projects

  • mmdetection

    OpenMMLab Detection Toolbox and Benchmark

  • LaTeX-OCR

    pix2tex: Using a ViT to convert images of equations into LaTeX code.

    Project mention: Show HN: Synthesize TikZ Graphics Programs for Scientific Figures and Sketches | news.ycombinator.com | 2024-06-06

    … already claim to (at least partially) support this.

    [1] https://github.com/lukas-blecher/LaTeX-OCR

  • omniparse

    Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

    Project mention: Show HN: I Made an Open Source Platform for Structuring Any Unstructured Data | news.ycombinator.com | 2024-07-02
  • SwinIR

    SwinIR: Image Restoration Using Swin Transformer (official repository)

    Project mention: A smooth and sharp image interpolation you probably haven't heard of | news.ycombinator.com | 2024-05-02
  • Efficient-AI-Backbones

    Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

  • mmpretrain

    OpenMMLab Pre-training Toolbox and Benchmark

  • scenic

    Scenic: A JAX Library for Computer Vision Research and Beyond (by google-research)

  • towhee

    Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

  • EVA

    EVA Series: Visual Representation Fantasies from BAAI (by baaivision)

  • EasyCV

    An all-in-one toolkit for computer vision

  • VRT

    VRT: A Video Restoration Transformer (official repository)

  • InternVideo

    [ECCV2024] Video Foundation Models & Data for Multimodal Understanding

  • thepipe

    Extract clean data from anywhere, powered by vision-language models ⚡

    Project mention: AIM Weekly for 07 Oct 2024 | dev.to | 2024-10-07

    🫶 Building Resilient AI Infrastructure: Deep Dive Zilliz Cloud's New Production-Ready Features 🙅 Contributing to Open Source 🛠️ Upcoming Data Engineering Best Practices for AI 📝 Building Scalable Image Retrieval 💫 NASA and IBM Weather Model 🙌 Improve RAG with Knowledge Graphs 🦾 Leader 📎 Evaluating RAG 🚙 Solid Data Curation 🤖 Sparse and Dense Embeddings 🍔 Cohere LLM University 📢 DataFormer for Synthetic Data 📢 PDF2Audio 📊 Screenpipe 📱 Vector DB Benchmarks 🛼 Extreme Quantization 📢 AI Powered Question & Answering 🐈‍⬛ Building LLMs Stanford Class 🌐 New Python Web UI 📊 Visualize RAG 🌐 Free Map Hosting 📊 Pipefunc 🖥️ The Pipe to extract 👽 New Audio Model 🧐 Easy Milvus Schema Generation 👽 Multimodal Models 72B 🌐 Fivetran + Milvus 🗣️ JSON Viewer 👽 ONNX Runtime GenAI 🚙 LLM Explorer 🦾 Interesting Computer Vision Techniques 📊 Build a model from embedding 🧩 Superchunk 👽 LLM Eval - Salesforce 🍔 Small AMD Model 🔥 Comfy UI 🔥 Molmo is a family of open vision-language models developed by the Allen Institute for AI. Molmo models are trained on PixMo

  • VoxFormer

    Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]

  • ONE-PEACE

    A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

  • vit-explain

    Explainability for Vision Transformers

  • how-do-vits-work

    (ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"

  • DAT

    Repository of Vision Transformer with Deformable Attention (CVPR 2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention (by LeapLabTHU)

  • ImageNet21K

    Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)

  • swin2sr

    [ECCV] Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration. Advances in Image Manipulation (AIM) workshop, ECCV 2022. Try it out (over 3.3M runs): https://replicate.com/mv-lab/swin2sr

  • parseq

    Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)

  • GCVit

    [ICML 2023] Official PyTorch implementation of Global Context Vision Transformers

  • MPViT

    [CVPR 2022] MPViT: Multi-Path Vision Transformer for Dense Prediction

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python vision-transformer related posts

  • Extracting Data from Tricky PDFs with Google Gemini in 10 lines of Python

    1 project | dev.to | 18 Jul 2024
  • Web Extraction with Vision-LLMs Done the Right Way: Structured Data From Any URL with GPT-4o

    1 project | dev.to | 22 May 2024
  • Show HN: I just open sourced my document/website extractor for Vision-LLMs

    2 projects | news.ycombinator.com | 2 Apr 2024
  • [Demo] Watch Videos with ChatGPT

    7 projects | /r/ChatGPT | 19 Apr 2023
  • [D] Off-the-shelf image saliency scoring models?

    2 projects | /r/MachineLearning | 17 Apr 2023
  • Scratch Implementation of Vision Transformer in PyTorch

    2 projects | /r/computervision | 13 Apr 2023
  • [R] InternVideo: General Video Foundation Models via Generative and Discriminative Learning

    1 project | /r/MachineLearning | 10 Apr 2023

Index

What are some of the best open-source vision-transformer projects in Python? This list will help you:

Rank  Project                  Stars
1     mmdetection              29,244
2     LaTeX-OCR                12,192
3     omniparse                 5,135
4     SwinIR                    4,390
5     Efficient-AI-Backbones    4,005
6     mmpretrain                3,398
7     scenic                    3,287
8     towhee                    3,185
9     EVA                       2,244
10    EasyCV                    1,776
11    VRT                       1,347
12    InternVideo               1,338
13    thepipe                   1,134
14    VoxFormer                 1,040
15    ONE-PEACE                   946
16    vit-explain                 814
17    how-do-vits-work            807
18    DAT                         770
19    ImageNet21K                 726
20    swin2sr                     576
21    parseq                      567
22    GCVit                       423
23    MPViT                       360
