Crawling@Home: Help Build the World's Largest Image-Text Pair Dataset!

DALLE-pytorch

20 5,492 2.5 Python

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch

Since then, several efforts have been organized to replicate DALL-E. People initially organized around this DALL-E replication repository, https://github.com/lucidrains/DALLE-pytorch, which shows some nice results in its README. More recently, as part of a Hugging Face event, new results have been achieved (see https://wandb.ai/dalle-mini/dalle-mini/reports/DALL-E-mini--Vmlldzo4NjIxODA), and an online demo is now available at https://huggingface.co/spaces/flax-community/dalle-mini

crawlingathome-worker

2 10 0.0 Python

DALLE-datasets

1 125 1.8 Python

This is a summary of easily available datasets for generalized DALLE-pytorch training.

A large part of the results that can be achieved with such models is thanks to data: large amounts of data. Today the largest open datasets of (image, text) pairs are on the order of 10M pairs (see https://github.com/robvanvolt/DALLE-datasets), which is enough to train okay models but not enough to reach the best performance. A public dataset with hundreds of millions of pairs could help a lot in building these image+text models.
