Python pretraining

Open-source Python projects categorized as pretraining

Top 10 Python pretraining Projects

  • LMOps

    General technology for enabling AI capabilities w/ LLMs and MLLMs

  • OFA

    Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

  • mPLUG-Owl

    mPLUG-Owl & mPLUG-Owl2: Modularized Multimodal Large Language Model

    Project mention: Unleash the Power of Video-LLaMA: Revolutionizing Language Models with Video and Audio Understanding! | dev.to | 2023-06-12

    We extend our deepest gratitude to the extraordinary projects that have influenced and contributed to the development of Video-LLaMA. We're indebted to MiniGPT-4, FastChat, BLIP-2, EVA-CLIP, ImageBind, LLaMA, VideoChat, LLaVA, WebVid, and mPLUG-Owl for their invaluable contributions. Special thanks to Midjourney for creating the stunning Video-LLaMA logo, encapsulating the essence of our groundbreaking project.

  • SparK

    [ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling" (by keyu-tian)

  • ImageNet21K

    Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)

  • PITI

    PITI: Pretraining is All You Need for Image-to-Image Translation

  • LinkBERT

    [ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links

  • Revisiting-Contrastive-SSL

    Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021]

  • tabular-dl-pretrain-objectives

    Revisiting Pretraining Objectives for Tabular Deep Learning

  • awesome-clip-papers

    The most impactful papers related to contrastive pretraining for multimodal models!

    Project mention: A History of CLIP Model Training Data Advances | dev.to | 2024-03-13

    For a comprehensive catalog of papers pushing the state of CLIP models forward, check out this Awesome CLIP Papers GitHub repository. Additionally, the Zero-shot Prediction Plugin for FiftyOne allows you to apply any of the OpenCLIP-compatible models to your own data.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-03-13.

Index

What are some of the best open-source pretraining projects in Python? This list will help you:

#   Project                          Stars
1   LMOps                            3,162
2   OFA                              2,318
3   mPLUG-Owl                        1,892
4   SparK                            1,384
5   ImageNet21K                        695
6   PITI                               470
7   LinkBERT                           389
8   Revisiting-Contrastive-SSL          86
9   tabular-dl-pretrain-objectives      58
10  awesome-clip-papers                 10