Top 23 Python pretrained-model Projects

transformers

175 124,557 10.0 Python

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Project mention: Maxtext: A simple, performant and scalable Jax LLM | news.ycombinator.com | 2024-04-23

Is t5x an encoder/decoder architecture?
Some more general options.
The Flax ecosystem
https://github.com/google/flax?tab=readme-ov-file
or dm-haiku
https://github.com/google-deepmind/dm-haiku
were some of the best developed communities in the Jax AI field
Perhaps the “trax” repo? https://github.com/google/trax
Some HF examples https://github.com/huggingface/transformers/tree/main/exampl...
Sadly it seems much of the work is proprietary these days, but one example could be Grok-1, if you customize the details. https://github.com/xai-org/grok-1/blob/main/run.py

pytorch-image-models

35 29,751 9.4 Python

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Project mention: FLaNK AI Weekly 18 March 2024 | dev.to | 2024-03-18

spleeter

230 24,878 1.5 Python

Deezer source separation library including pretrained models.

Project mention: Are stems a good way of making mashups | /r/Beatmatch | 2023-12-10

The stem separators in Virtual DJ and similar tools use a shrunken version of this model: https://github.com/deezer/spleeter. You will get better results by downloading the original along with their large model.

PaddleNLP

2 11,386 9.8 Python

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
Qwen

5 10,893 9.5 Python

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Project mention: What the heck is so great about this model? | /r/SillyTavernAI | 2023-12-07

Qwen: https://github.com/QwenLM/Qwen

segmentation_models.pytorch

14 8,800 2.8 Python

Segmentation models with pretrained backbones. PyTorch.

Project mention: Instance segmentation of small objects in grainy drone imagery | /r/computervision | 2023-12-09

Also, I’d suggest considering switching to the segmentation-models library. It provides U-Net models with a variety of pretrained backbones as encoders. The author also put out a PyTorch version. https://github.com/qubvel/segmentation_models.pytorch https://github.com/qubvel/segmentation_models

petals

98 8,661 8.5 Python

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Project mention: Mistral Large | news.ycombinator.com | 2024-02-26

So how long until we can do an open source Mistral Large?
We could make a start on Petals or some other open source distributed training network cluster possibly?
[0] https://petals.dev/

LMFlow

10 7,975 9.5 Python

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Project mention: Your weekly machine learning digest | /r/learnmachinelearning | 2023-07-03

CodeGeeX

9 7,751 2.0 Python

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

Project mention: For Developers - THUDM/CodeGeeX: CodeGeeX: An Open Multilingual Code Generation Model | /r/OfflineAI | 2023-05-20

EfficientNet-PyTorch

2 7,715 0.0 Python

A PyTorch implementation of EfficientNet and EfficientNetV2 (coming soon!)
mmf

2 5,413 5.5 Python

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
PaddleClas

2 5,251 5.6 Python

A treasure chest for visual classification and recognition powered by PaddlePaddle
CogVLM

16 4,968 9.0 Python

A state-of-the-art-level open visual language model | Multimodal pretrained model

Project mention: Mixtral: Mixture of Experts | news.ycombinator.com | 2024-01-08

CogVLM is very good in my (brief) testing: https://github.com/THUDM/CogVLM
The model weights seem to be under a non-commercial license, not true open source, but it is "open access" as you requested.

awesome-pretrained-chinese-nlp-models

1 4,193 8.9 Python

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
facenet-pytorch

4 4,144 3.8 Python

Pretrained PyTorch face detection (MTCNN) and facial recognition (InceptionResnet) models

Project mention: [D] Fast face recognition over video | /r/MachineLearning | 2023-04-22

Hijacking this comment because I've been working nonstop on my project thanks to your suggestion. I'm now using https://github.com/derronqi/yolov8-face for face detection and still the old face_recognition for encodings. I'm clustering with DBSCAN and extracting frames with ffmpeg with -hwaccel on. I'm planning to try https://github.com/timesler/facenet-pytorch, as it looks like it would be the fastest thing available to process videos? Keep in mind I need to compute encodings, not just run detection, because I want to use DBSCAN (and later also facial recognition, but this might be done separately just by saving the encodings). Let me know if you have any other suggestions, and thanks again for your help.

Efficient-AI-Backbones

3 3,783 4.4 Python

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Chinese-CLIP

1 3,590 7.6 Python

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
PINTO_model_zoo

5 3,288 9.8 Python

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.
mmpretrain

2 3,156 7.8 Python

OpenMMLab Pre-training Toolbox and Benchmark
Pretrained-Language-Model

1 2,956 6.1 Python

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Project mention: Does anyone know a downloadable chatgpt model that supports conversation in Albanian? | /r/Programimi | 2023-05-16

deepsparse

21 2,866 9.6 Python

Sparsity-aware deep learning inference runtime for CPUs

Project mention: Fast Llama 2 on CPUs with Sparse Fine-Tuning and DeepSparse | news.ycombinator.com | 2023-11-23

Interesting company. Yannic Kilcher interviewed Nir Shavit last year and they went into some depth: https://www.youtube.com/watch?v=0PAiQ1jTN5k DeepSparse is on GitHub: https://github.com/neuralmagic/deepsparse

OFA

3 2,323 2.8 Python

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
asteroid

2 2,103 5.4 Python

The PyTorch-based audio source separation toolkit for researchers

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python pretrained-models related posts

CogAgent-18B – visual-based GUI Agent capabilities
2 projects | news.ycombinator.com | 16 Dec 2023
Are stems a good way of making mashups
1 project | /r/Beatmatch | 10 Dec 2023
What do you think. When should we expect the next SDXL version?
1 project | /r/StableDiffusion | 10 Dec 2023
Big News!
1 project | /r/OnePieceMangaCut | 9 Dec 2023
Anybody here know what AI model does Steinberg's Spectralayers use to do stem separation?
1 project | /r/audioengineering | 8 Dec 2023
Gemini: Google's most capable AI model yet
2 projects | news.ycombinator.com | 6 Dec 2023
Open-source LLMs with Image Interpretation
1 project | /r/LocalLLaMA | 6 Dec 2023

Index

What are some of the best open-source pretrained-model projects in Python? This list will help you:

#	Project	Stars
1	transformers	124,557
2	pytorch-image-models	29,751
3	spleeter	24,878
4	PaddleNLP	11,386
5	Qwen	10,893
6	segmentation_models.pytorch	8,800
7	petals	8,661
8	LMFlow	7,975
9	CodeGeeX	7,751
10	EfficientNet-PyTorch	7,715
11	mmf	5,413
12	PaddleClas	5,251
13	CogVLM	4,968
14	awesome-pretrained-chinese-nlp-models	4,193
15	facenet-pytorch	4,144
16	Efficient-AI-Backbones	3,783
17	Chinese-CLIP	3,590
18	PINTO_model_zoo	3,288
19	mmpretrain	3,156
20	Pretrained-Language-Model	2,956
21	deepsparse	2,866
22	OFA	2,323
23	asteroid	2,103