Top 7 Python multimodal-learning Projects
-
Multimodal-Toolkit
Multimodal model for text and tabular data, with HuggingFace transformers as the building block for text data
-
pykale
Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the PyTorch ecosystem. Star to support our work!
-
UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
-
valhalla-nmt
Code repository for CVPR 2022 paper "VALHALLA: Visual Hallucination for Machine Translation"
-
Coin-CLIP
Coin-CLIP: fine-tuned from CLIP on a vast collection of coin images using contrastive learning. It enhances feature extraction for coins, boosting image search accuracy. This model merges the Vision Transformer (ViT) with CLIP's multimodal learning, optimized for numismatic applications.
Project mention: Are there any multimodal AI models I can use to provide a paired text *and* image input, to then generate an expanded descriptive text output? [D] | /r/MachineLearning | 2023-07-05
Maybe the recent OpenFlamingo gives you better results (they have a demo on HF).
It was created by curating 3.8 million high-resolution videos from the publicly available HD-VILA-100M dataset. The dataset is created following these three steps:
Project mention: Show HN: Compress vision-language and unimodal AI models by structured pruning | news.ycombinator.com | 2023-07-31
Project mention: New Multimodal Model Coin-CLIP for Coin Identification/Recognition | /r/Multimodal | 2023-12-08
Python multimodal-learning related posts
- New Multimodal Model Coin-CLIP for Coin Identification/Recognition
- Are there any multimodal AI models I can use to provide a paired text *and* image input, to then generate an expanded descriptive text output? [D]
- [D] Multi modal for visual qna based on a given image. Need suggestions.
- Open Flamingo: An open-source framework for training large multimodal models
- [D]Are there any good solutions for multimodal classification? Libraries, AutoML tool?
- Classification problem with text and numerical features
Index
What are some of the best open-source multimodal-learning projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | open_flamingo | 3,459 |
2 | Multimodal-Toolkit | 553 |
3 | XPretrain | 436 |
4 | pykale | 427 |
5 | UPop | 83 |
6 | valhalla-nmt | 26 |
7 | Coin-CLIP | 9 |