Vision_transformer Alternatives

Similar projects and alternatives to vision_transformer

Pytorch

336 77,783 10.0 Python vision_transformer VS Pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
pytorch-image-models

35 29,751 9.4 Python vision_transformer VS pytorch-image-models

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
docarray

32 2,739 9.2 Python vision_transformer VS docarray

Represent, send, store and search multimodal data
typeshed

24 4,066 9.9 Python vision_transformer VS typeshed

Collection of library stubs for Python, with static types
beartype

18 2,420 9.4 Python vision_transformer VS beartype

Unbearably fast near-real-time hybrid runtime-static type-checking in pure Python.
nerfstudio

10 8,488 9.6 Python vision_transformer VS nerfstudio

A collaboration friendly studio for NeRFs
TorchSharp

5 1,235 9.6 C# vision_transformer VS TorchSharp

A .NET library that provides access to the library that powers PyTorch.
ImageNet21K

1 695 10.0 Python vision_transformer VS ImageNet21K

Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS, 2021)
Fashion12K_german_queries

1 3 0.0 Python vision_transformer VS Fashion12K_german_queries
fashion-200k

1 60 10.0 vision_transformer VS fashion-200k

Fashion 200K dataset used in paper "Automatic Spatially-aware Fashion Concept Discovery."

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better vision_transformer alternative or higher similarity.

vision_transformer reviews and mentions

Posts with mentions or reviews of vision_transformer. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-01-31.

Can I use CLIP to tag my picture collection?
1 project | /r/OpenAI | 10 Jun 2023

And one last thing, should I even be thinking of using CLIP for these tasks when Google has released a better model here: https://github.com/google-research/vision_transformer/blob/main/model_cards/lit.md
When the client's management is happy but their dev team is a pain
8 projects | /r/ProgrammerHumor | 31 Jan 2023

Google's vision transformers are type hinted.
Improving Search Quality for Non-English Queries with Fine-tuned Multilingual CLIP Models
5 projects | dev.to | 22 Dec 2022

We’re going to look at a model that OpenAI has trained with a broad multilingual dataset: the xlm-roberta-base-ViT-B-32 CLIP model, which uses the ViT-B/32 image encoder and the XLM-RoBERTa multilingual language model. Both of these are pre-trained:
[R] How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
3 projects | /r/MachineLearning | 20 Jun 2021

JAX Code: https://github.com/google-research/vision_transformer
[D] (Paper Overview) MLP-Mixer: An all-MLP Architecture for Vision
1 project | /r/MachineLearning | 5 May 2021
[P] Animesion: a framework, for anime (and related) character recognition. It uses Vision Transformers trained on a subset of Danbooru2018, that we rebranded as DAF:re, and can classify a given image into one of more than 3000 characters! Source code and checkpoints included.
1 project | /r/MachineLearning | 3 Feb 2021

For this project I used the pretrained models released by Google in JAX, using this particular PyTorch custom implementation. Those were pretrained on ImageNet21k, with 14M images among 21K classes. Then yes, I fine-tune on two datasets: one with 15K images and 170 characters, and one with 3K characters and almost 500K images.
Short term memory solutions for video tasks?
1 project | /r/deeplearning | 22 Jan 2021