Top 23 text-to-image Open-Source Projects

DALLE2-pytorch

65 10,826 6.8 Python

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

Project mention: One year ago I got access to closed beta DALL-E 2. | /r/singularity | 2023-05-25

I was showing people Dalle2 last year and telling them how much of an impact an open source solution was going to have on, well, everything to do with art and design. (At the time Stable Diffusion had not been released, not even the leak, and all hopes were on https://github.com/lucidrains/DALLE2-pytorch)

imagen-pytorch

47 7,787 6.8 Python

Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch

Project mention: Google's StyleDrop can transfer style from a single image | /r/StableDiffusion | 2023-06-03

If Google doesn't, someone like lucidrains probably would implement it, just like he did for imagen and muse.

Dreambooth-Stable-Diffusion

47 7,383 0.0 Jupyter Notebook

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

Project mention: Where can I train my own LoRA? | /r/StableDiffusionInfo | 2023-06-21

DALLE-pytorch

20 5,493 2.5 Python

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Project mention: The Eleuther AI Mafia | news.ycombinator.com | 2023-09-03

It all started originally on lucidrains/dalle-pytorch in the months following the release of DALL-E (1). The group started as `dalle-pytorch-replicate` but was never officially "blessed" by Phil Wang who seems to enjoy being a free agent (can't blame him).
https://github.com/lucidrains/DALLE-pytorch/issues/116 is where the Discord originally got kicked off. There are a lot of other interactions between us on GitHub there. You should be able to find when Phil was approached by Jenia Jitsev, Jan Ebert, and Mehdi Cherti (all starting LAION members), who graciously offered the chance to replicate the DALL-E paper using their available compute on the JUWELS and JUWELS Booster HPC systems. This all predates Emad's arrival. I believe he showed up around the time of guided diffusion and GLIDE, but it may have been a bit earlier.
Data work originally focused on amassing several of the bigger datasets of the time. Getting CC12M downloaded and trained on was something of an early milestone (robvanvolt's work). A lot of early work was like that though, shuffling through CC12M, COCO, etc. with the dalle-pytorch codebase until we got an avocado armchair.
Christoph Schuhmann was an early contributor as well and great at organizing and rallying. He focused a lot on the early data scraping work for what would become the "LAION-5B" dataset. I don't want to credit him with the coding, and I'm ashamed to admit I can't recall who did much of the work there, but a distributed scraping program was developed (the name was something@home... not scraping@home?).
The discord link on Phil Wang's readme at dalle-pytorch got a lot of traffic and a lot of people who wanted to pitch in with the scraping effort.
Eventually a lot of people from Eleuther and many other teams mingled with us, and some sort of non-profit org was created in Germany, I believe for legal purposes. The dataset continued to grow and the group moved from training DALL-Es to finetuning diffusion models.
The `CompVis` team were a great inspiration at the time, and much of their work on VQGAN and then latent diffusion models basically kept us motivated. As I mentioned, a personal motivation was Katherine Crowson's work on a variety of things like CLIP-guided VQGAN, diffusion, etc.
I believe Emad Mostaque showed up around the time GLIDE was coming out? I want to say he donated money for scrapers to be run on AWS to speed up data collection. I was largely hands off for much of the data scraping process and mostly enjoyed training new models on data we had.
As with any online community, things got pretty ill-defined, roles changed over, volunteers came and went, etc. I would hardly call this definitive, and that's at least partially the reason it's hard to trace as an outsider. That much of the early history is scattered about GitHub issues and PRs can't have helped though.

deep-daze

49 4,379 0.0 Python

Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun

min-dalle

31 3,474 0.0 Python

min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch

Awesome-Prompt-Engineering

9 3,212 5.8 Python

This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

Project mention: AI lessons | /r/ChatGPT | 2023-05-09

Yes, there are a lot of different resources online, especially for generative AI. The Awesome Prompt Engineering github is probably a good place to start https://github.com/promptslab/Awesome-Prompt-Engineering. If you're focusing directly on OpenAI's models then the OpenAI Prompt Engineering Guide would be my recommendation https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api.

dalle-playground

35 2,762 3.2 JavaScript

A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)

Project mention: Discord bot with a locally-hosted SD backend. | /r/StableDiffusion | 2023-05-17

Built on dalle-playground because it is simple and I like it.

Kandinsky-2

15 2,695 6.9 Jupyter Notebook

Kandinsky 2 — multilingual text2image latent diffusion model

Project mention: New Kandinsky 2.2 was released. Now with controlnets and code for lora fine-tuning. | /r/StableDiffusion | 2023-07-13

Diffusion-Models-Papers-Survey-Taxonomy

2 2,670 6.9

Diffusion model papers, survey, and taxonomy

VQGAN-CLIP

67 2,563 0.0 Python

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Project mention: 📚 Tutorials & 🎨 AI Art Generation Tool List Mega Thread | /r/AI_Aesthetics | 2023-07-26

VQGAN-CLIP

big-sleep

62 2,559 0.0 Python

A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun

carefree-creator

14 2,117 9.4 Jupyter Notebook

AI magics meet Infinite draw board.

awesome-generative-ai

5 1,987 9.5 Jupyter Notebook

A curated list of Generative AI tools, works, models, and references (by filipecalegario)

Project mention: Generative AI – A curated list of Generative AI tools, works, models | news.ycombinator.com | 2023-07-14

Awesome-Text-to-Image

1 1,878 9.1

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

ru-dalle

50 1,639 0.0 Jupyter Notebook

Generate images from texts. In Russian

CogView

16 1,602 4.2 Python

Text-to-Image generation. The repo for NeurIPS 2021 paper "CogView: Mastering Text-to-Image Generation via Transformers".

TokenFlow

1 1,462 6.2 Python

Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR 2024)

Project mention: TokenFlow has been Released | /r/StableDiffusion | 2023-09-07

Code: https://github.com/omerbt/TokenFlow

Radiata

8 983 8.1 Python

Stable diffusion webui based on diffusers.

Project mention: 🌠🌟Radiata TensorRT WebUI ⚡🏎️💨 | /r/DeepFloydIF | 2023-06-02

text2room

3 972 5.3 Python

Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV2023).

CogView2

11 929 0.0 Python

Official code repo for paper "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers"

MultiDiffusion

13 906 4.8 Jupyter Notebook

Official Pytorch Implementation for "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" presenting "MultiDiffusion" (ICML 2023)

Project mention: Opendream: A Non-Destructive UI for Stable Diffusion | news.ycombinator.com | 2023-08-15

For composing this approach works pretty well
https://multidiffusion.github.io/

muse-maskgit-pytorch

5 816 5.6 Python

Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch

Project mention: Google's StyleDrop can transfer style from a single image | /r/StableDiffusion | 2023-06-03

If Google doesn't, someone like lucidrains probably would implement it, just like he did for imagen and muse.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).


Index

What are some of the best open-source text-to-image projects? This list will help you:

	Project	Stars
1	DALLE2-pytorch	10,826
2	imagen-pytorch	7,787
3	Dreambooth-Stable-Diffusion	7,383
4	DALLE-pytorch	5,493
5	deep-daze	4,379
6	min-dalle	3,474
7	Awesome-Prompt-Engineering	3,212
8	dalle-playground	2,762
9	Kandinsky-2	2,695
10	Diffusion-Models-Papers-Survey-Taxonomy	2,670
11	VQGAN-CLIP	2,563
12	big-sleep	2,559
13	carefree-creator	2,117
14	awesome-generative-ai	1,987
15	Awesome-Text-to-Image	1,878
16	ru-dalle	1,639
17	CogView	1,602
18	TokenFlow	1,462
19	Radiata	983
20	text2room	972
21	CogView2	929
22	MultiDiffusion	906
23	muse-maskgit-pytorch	816