Python multimodal-learning

Open-source Python projects categorized as multimodal-learning

Top 9 Python multimodal-learning Projects

  1. open_flamingo

    An open-source framework for training large multimodal models.

  2. Multimodal-Toolkit

    Multimodal model for text and tabular data, with HuggingFace transformers as the building block for text data

  3. XPretrain

    Multi-modality pre-training

  4. pykale

    Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!

  5. LViT

    [IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"

  6. ViT-Lens

    [CVPR 2024] ViT-Lens: Towards Omni-modal Representations

  7. UPop

    [ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.

  8. valhalla-nmt

    Code repository for CVPR 2022 paper "VALHALLA: Visual Hallucination for Machine Translation"

  9. Coin-CLIP

    Coin-CLIP: fine-tuned from CLIP on a vast collection of coin images using contrastive learning. It enhances feature extraction for coins, boosting image search accuracy. This model merges Vision Transformer (ViT) with CLIP's multimodal learning, optimized for numismatic applications.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python multimodal-learning related posts

  • New Multimodal Model Coin-CLIP for Coin Identification/Recognition

    1 project | /r/Multimodal | 8 Dec 2023
  • Are there any multimodal AI models I can use to provide a paired text *and* image input, to then generate an expanded descriptive text output? [D]

    2 projects | /r/MachineLearning | 5 Jul 2023
  • [D] Multi modal for visual qna based on a given image. Need suggestions.

    1 project | /r/MachineLearning | 2 May 2023
  • Open Flamingo: An open-source framework for training large multimodal models

    1 project | news.ycombinator.com | 30 Mar 2023
  • [D]Are there any good solutions for multimodal classification? Libraries, AutoML tool?

    2 projects | /r/MachineLearning | 31 Mar 2022
  • Classification problem with text and numerical features

    1 project | /r/LanguageTechnology | 15 Apr 2021

Index

What are some of the best open-source multimodal-learning projects in Python? This list will help you:

#  Project             Stars
1  open_flamingo       3,897
2  Multimodal-Toolkit    603
3  XPretrain             491
4  pykale                457
5  LViT                  338
6  ViT-Lens              175
7  UPop                  101
8  valhalla-nmt           28
9  Coin-CLIP              19