Python foundation-models

Open-source Python projects categorized as foundation-models

Top 18 Python foundation-model Projects

  • ColossalAI

    Making large AI models cheaper, faster and more accessible

  • Project mention: FLaNK AI-April 22, 2024 | dev.to | 2024-04-22
  • unilm

    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

  • Project mention: The Era of 1-Bit LLMs: Training Tips, Code and FAQ [pdf] | news.ycombinator.com | 2024-03-21
  • LLaVA

    [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

  • Project mention: Show HN: I Remade the Fake Google Gemini Demo, Except Using GPT-4 and It's Real | news.ycombinator.com | 2023-12-10

    Update: For anyone else facing the commercial use question on LLaVA - it is licensed under Apache 2.0. Can be used commercially with attribution: https://github.com/haotian-liu/LLaVA/blob/main/LICENSE

  • Otter

    🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

  • Project mention: OpenAI vs Google, Detect ChatGPT Content with 99% accuracy, Navigating AI compute costs | /r/ChatGPT | 2023-06-15

    👀 Video-LLaMA - Empower large language models with video and audio understanding capability.
    🦦 Otter - Multi-modal model with improved instruction-following and in-context learning ability.
    🔗 Linkly.AI - AI-powered lead analytics and management platform that helps you track, analyze, and streamline your leads in one place.
    🎬 Jet Cut Ready - AI plugin for Adobe Premiere Pro that automatically removes silent parts in videos.
    💬 HeyGen's ChatGPT Plugin - Convert text into high-quality videos using AI text and video generation.

  • NExT-GPT

    Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

  • Project mention: Show HN: NExT-GPT – First LLM working with multimodal input and output | news.ycombinator.com | 2023-09-21
  • Ask-Anything

    [CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

  • EVA

    EVA Series: Visual Representation Fantasies from BAAI (by baaivision)

  • chronos-forecasting

    Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting

  • Project mention: Financial Market Applications of LLMs | news.ycombinator.com | 2024-04-20

    There were some developments using LLMs in the timeseries domain which caught my attention.

    I toyed with the Chronos forecasting toolkit [1], and the results were predictably off by wild margins [2]

    What really caught my eye though was the "feel" of the predicted timeseries -- this is the first time I've seen synthetic timeseries that look like the real thing. Stock charts have a certain quality to them, once you've been looking at them long enough, you can tell more often than not whether some unlabeled data is a stock price timeseries or not. It seems the chronos LLM was able to pick up on that "nature" of the price movement, and replicate it in its forecasts. Impressive!

    1: https://github.com/amazon-science/chronos-forecasting

    2: https://imgur.com/a/hTRQ38d

  • autodistill

    Images to inference with no labeling (use foundation models to train supervised models).

  • Project mention: Ask HN: Who is hiring? (February 2024) | news.ycombinator.com | 2024-02-01

    Roboflow | Open Source Software Engineer, Web Designer / Developer, and more. | Full-time (Remote, SF, NYC) | https://roboflow.com/careers?ref=whoishiring0224

    Roboflow is the fastest way to use computer vision in production. We help developers give their software the sense of sight. Our end-to-end platform[1] provides tooling for image collection, annotation, dataset exploration and curation, training, and deployment.

    Over 250k engineers (including engineers from 2/3 Fortune 100 companies) build with Roboflow. We now host the largest collection of open source computer vision datasets and pre-trained models[2]. We are pushing forward the CV ecosystem with open source projects like Autodistill[3] and Supervision[4]. And we've built one of the most comprehensive resources for software engineers to learn to use computer vision with our popular blog[5] and YouTube channel[6].

    We have several openings available but are primarily looking for strong technical generalists who want to help us democratize computer vision and like to wear many hats and have an outsized impact. Our engineering culture is built on a foundation of autonomy & we don't consider an engineer fully ramped until they can "choose their own loss function". At Roboflow, engineers aren't just responsible for building things but also for helping us figure out what we should build next. We're builders & problem solvers; not just coders. (For this reason we also especially love hiring past and future founders.)

    We're currently hiring full-stack engineers for our ML and web platform teams, a web developer to bridge our product and marketing teams, several technical roles on the sales & field engineering teams, and our first applied machine learning researcher to help push forward the state of the art in computer vision.

    [1]: https://roboflow.com/?ref=whoishiring0224

    [2]: https://roboflow.com/universe?ref=whoishiring0224

    [3]: https://github.com/autodistill/autodistill

    [4]: https://github.com/roboflow/supervision

    [5]: https://blog.roboflow.com/?ref=whoishiring0224

    [6]: https://www.youtube.com/@Roboflow

  • Emu

    Emu Series: Generative Multimodal Models from BAAI (by baaivision)

  • Project mention: Show HN: Emu2 – A Gemini-like open-source 37B Multimodal Model | news.ycombinator.com | 2023-12-21

    I'm excited to introduce Emu2, the latest generative multimodal model developed by the Beijing Academy of Artificial Intelligence (BAAI). Emu2 is an open-source initiative that reflects BAAI's commitment to fostering open, secure, and responsible AI research. It's designed to enhance AI's proficiency in handling tasks across various modalities with minimal examples and straightforward instructions.

    Emu2 has demonstrated superior performance over other large-scale models like Flamingo-80B in few-shot multimodal understanding tasks. It serves as a versatile base model for developers, providing a flexible platform for crafting specialized multimodal applications.

    Key features of Emu2 include:

    - A more streamlined modeling framework than its predecessor, Emu.

    - A decoder capable of reconstructing images from the encoder's semantic space.

    - An expansion to 37 billion parameters, boosting both capabilities and generalization.

    BAAI has also released fine-tuned versions, Emu2-Chat for visual understanding and Emu2-Gen for visual generation, which stand as some of the most powerful open-source models available today.

    Here are the resources for those interested in exploring or contributing to Emu2:

    - Project: https://baaivision.github.io/emu2/

    - Model: https://huggingface.co/BAAI/Emu2

    - Code: https://github.com/baaivision/Emu/tree/main/Emu2

    - Demo: https://huggingface.co/spaces/BAAI/Emu2

    - Paper: https://arxiv.org/abs/2312.13286

    We're eager to see how the HN community engages with Emu2 and we welcome your feedback to help us improve. Let's collaborate to push the boundaries of multimodal AI!

  • lag-llama

    Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

  • Project mention: Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting | news.ycombinator.com | 2024-02-26
  • InternVideo

    Video Foundation Models & Data for Multimodal Understanding

  • ONE-PEACE

    A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

  • Project mention: A general representation modal across vision, audio, language modalities | news.ycombinator.com | 2023-05-25
  • meerkat

    Creative interactive views of any dataset.

  • MindVideo

    Official code base for MinD-Video

  • Project mention: This research project on reconstructing video stimulus to the brain using an MRI scanner and AI algorithms reminds me of the RDA brain reading technology | /r/Avatar | 2023-06-23
  • fondant

    Production-ready data processing made easy and shareable

  • Project mention: 25 million Creative Commons image dataset released! | /r/StableDiffusion | 2023-10-01

    Github: https://github.com/ml6team/fondant

  • GRID-playground

    Platform for General Robot Intelligence Development

  • Project mention: GRID: General Robot Intelligence Development Platform | news.ycombinator.com | 2023-10-17
  • meta-prompting

    Official implementation of BGPT @ ICLR 2024 paper "Meta Prompting for AI Systems" (https://arxiv.org/abs/2311.11482)

  • Project mention: Meta Prompting for AGI Systems | news.ycombinator.com | 2024-02-29
NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source foundation-model projects in Python? This list will help you:

# Project Stars
1 ColossalAI 37,836
2 unilm 18,319
3 LLaVA 16,101
4 Otter 3,441
5 NExT-GPT 2,860
6 Ask-Anything 2,663
7 EVA 1,957
8 chronos-forecasting 1,589
9 autodistill 1,529
10 Emu 1,491
11 lag-llama 942
12 InternVideo 909
13 ONE-PEACE 838
14 meerkat 811
15 MindVideo 348
16 fondant 319
17 GRID-playground 243
18 meta-prompting 29
