DALL-E 2

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • dalle-2-preview

  • A few comments by someone who's spent way too much time in the AI image-generation space:

    * I recommend reading the System Card that came with it because it's very thorough: https://github.com/openai/dalle-2-preview/blob/main/system-c...

    * Unlike GPT-3, my read of this announcement is that OpenAI does not intend to commercialize it, and that access to the waitlist is indeed more for testing its limits (and as noted, commercializing it would make it much more likely to lead to interesting legal precedent). Per the docs, access is very explicitly limited: (https://github.com/openai/dalle-2-preview/blob/main/system-c... )

    * A few months ago, OpenAI released GLIDE ( https://github.com/openai/glide-text2im ) which uses a similar approach to AI image generation, but suspiciously never received a fun blog post like this one. The reason for that in retrospect may be "because we made it obsolete."

    * The images in the announcement are still cherry-picked, which is presumably why they compared DALL-E 1 vs. DALL-E 2 on non-cherry-picked images.

    * Cherry-picking is relevant because AI image generation is still slow unless you do real shenanigans that likely compromise image quality, although OpenAI likely has better infrastructure to handle large models, as they have demonstrated with GPT-3.

  • jukebox

    Code for the paper "Jukebox: A Generative Model for Music"

  • That was also my favourite concept, especially with OpenAI Jukebox (https://openai.com/blog/jukebox/). The idea of having new music in the style of your favourite artist is amazing.

    However, the fidelity of their music AI kinda sucks at this point, but I'm sure we'll get pitch-perfect versions of this concept as the singularity gets closer :)

  • sentencepiece

    Unsupervised text tokenizer for Neural Network-based text generation.

  • Haven't read the paper, but they are probably using something like SentencePiece with sub-word splitting and then charging by the number of resulting tokens.

    https://github.com/google/sentencepiece
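
The sub-word splitting described above can be sketched with a toy byte-pair-encoding (BPE) trainer in pure Python. This is an illustrative simplification, not SentencePiece's actual algorithm (which also offers a unigram-LM mode and operates on raw sentences); the corpus and merge count here are made up:

```python
from collections import Counter

def bpe_train(corpus, num_merges):
    """Learn `num_merges` symbol-pair merges from a whitespace-split corpus."""
    vocab = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the current vocabulary.
        pairs = Counter()
        for word, freq in vocab.items():
            for i in range(len(word) - 1):
                pairs[(word[i], word[i + 1])] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge to every word.
        new_vocab = Counter()
        for word, freq in vocab.items():
            merged, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

def bpe_encode(word, merges):
    """Split a word into sub-word tokens by replaying the learned merges."""
    symbols = list(word)
    for a, b in merges:
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols
```

Billing "by the number of resulting tokens" would then just be `len(bpe_encode(text, merges))` times a per-token price.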

  • glide-text2im

    GLIDE: a diffusion-based text-conditional image synthesis model
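
For readers unfamiliar with the "diffusion-based" part: these models learn to reverse a fixed noising process. A minimal sketch of the forward (noising) step, assuming the standard DDPM linear beta schedule (the schedule constants here are the commonly quoted defaults, not anything specific to GLIDE):

```python
import math
import random

def forward_noise(x0, t, T=1000):
    # DDPM forward process: q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I),
    # where abar_t is the cumulative product of (1 - beta_i) up to step t.
    betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
    abar = 1.0
    for b in betas[: t + 1]:
        abar *= 1.0 - b
    eps = random.gauss(0.0, 1.0)  # the noise a diffusion model is trained to predict
    x_t = math.sqrt(abar) * x0 + math.sqrt(1.0 - abar) * eps
    return x_t, eps
```

At t=0 the sample is nearly the clean input; by t=T-1 it is almost pure Gaussian noise, and the network's job is to walk that process backwards.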


  • bevy_retro

    Plugin pack for making 2D games with Bevy

  • There are programming projects[1] out there that use licenses to prevent people from using projects in ways the authors don't agree with. You could also argue that GPL does the same thing (prevents people from using/distributing the software in the way they would like).

    Whether you consider it moral doesn't seem relevant; what matters is respecting the wishes of the authors of such programs.

    [1] https://github.com/katharostech/bevy_retrograde/blob/master/...

  • DALLE-mtf

    Open-AI's DALL-E for large scale training in mesh-tensorflow.

  • stylegan2-pytorch

    Simplest working implementation of Stylegan2, state of the art generative adversarial network, in Pytorch. Enabling everyone to experience disentanglement

  • https://thispersondoesnotexist.com

    I think if I were istockphoto.com I'd be a little worried, but that is microstock photography; I'm not sure it's worth billions. Besides, once this tech is widely available, if anything it devalues this sort of thing further, closer to $0.

  • gpt-2

    Code for the paper "Language Models are Unsupervised Multitask Learners"

  • For discussion's sake:

    - BFN reached out to A16Z, Worldcoin, and Khosla Ventures; they largely declined to comment, which would mean that at least one person probably had a bit of runway from at least when the requests for comment were submitted. So yeah, you're probably right.

    - Going from the GitHub repos for GPT-2 and GPT-3, those may have been hard launches:

    Feb 14 2019, predating the first press for GPT-2 by a few days (was probably made public Feb 14 though) - https://github.com/openai/gpt-2/commit/c2dae27c1029770cea409...

    May 28 2020, timed alongside the press news for GPT-3 - https://github.com/openai/gpt-3/commit/12766ba31aa6de490226e...

    - Would it really have to be a conspiracy? Sounds like only one person would have to target a specific date or date range, and without really giving a reason.

    One of the things that puts a hole in my own thinking here is that Sam Altman's name isn't really tied to the Dall-E 2 release. It's just OpenAI, and all the press around Sam's name today still almost exclusively surfaces Worldcoin stories. So if this was actually intended to bury another story, Sam's name would have to have been included in all the press blasts to be successful.

  • gpt-3

    Discontinued GPT-3: Language Models are Few-Shot Learners


  • community-events

    Place where folks can contribute to 🤗 community events

  • If you're interested in generative models, Hugging Face is putting on an event around generative models right now called the HugGAN sprint, where they're giving away free access to compute to train models like this.

    You can join it by following the steps in the guide here: https://github.com/huggingface/community-events/tree/main/hu...

    There will also be talks from awesome folks at EleutherAI, Google, and Deepmind

  • dalle-mini

    DALL·E Mini - Generate images from a text prompt

  • tensorrtx

    Implementation of popular deep learning networks with TensorRT network definition API

  • I'll try them out. I have an RTX 2070, which apparently supports fp16. But it only has 8GB RAM.

    I used the instructions here to check: https://github.com/wang-xinyu/tensorrtx/blob/master/tutorial...
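
As a rough rule of thumb (my assumption, not something from the tensorrtx tutorial): tensor-core FP16 arrived with Volta, i.e. CUDA compute capability 7.0, and the RTX 2070 is Turing at 7.5. Given the capability tuple (e.g. from `torch.cuda.get_device_capability()` or `nvidia-smi`), the check is simply:

```python
def supports_tensor_core_fp16(major, minor):
    # Tensor cores (fast FP16 matmul) first shipped with Volta (compute capability 7.0);
    # Turing cards like the RTX 2070 report 7.5. Pre-Volta cards may still execute FP16,
    # just without the tensor-core speedup.
    return (major, minor) >= (7, 0)
```

The 8 GB of VRAM is the more likely bottleneck than FP16 support for large generative models.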

  • lm-human-preferences

    Code for the paper Fine-Tuning Language Models from Human Preferences

  • The kind of measures they are taking, like simply deleting wholesale anything problematic, don't really have a '-1'.

    But amusingly, exactly that did happen in one of their GPT experiments! https://openai.com/blog/fine-tuning-gpt-2/

  • v-diffusion-pytorch

    v objective diffusion inference code for PyTorch.

  • [7] Katherine Crowson. https://twitter.com/RiversHaveWings/status/1462859669454536711, 2021.

    [8] Katherine Crowson. v-diffusion. https://github.com/crowsonkb/v-diffusion-pytorch, 2021.
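
For context on the "v objective" in the repo name: instead of predicting the noise eps directly, the network regresses v = alpha_t * eps - sigma_t * x0, the parameterization from Salimans & Ho's progressive-distillation paper. A sketch, assuming a cosine schedule (the schedule choice here is my assumption for illustration):

```python
import math

def v_target(x0, eps, t):
    # Cosine schedule: alpha_t = cos(pi*t/2), sigma_t = sin(pi*t/2), with t in [0, 1].
    alpha = math.cos(math.pi * t / 2)
    sigma = math.sin(math.pi * t / 2)
    z_t = alpha * x0 + sigma * eps   # the noised sample the network sees
    v = alpha * eps - sigma * x0     # the regression target under the v objective
    return z_t, v
```

At t=0 the sample is clean and v reduces to the noise; at t=1 the sample is pure noise and v reduces to -x0, which is one reason this target behaves better than plain eps-prediction near the ends of the schedule.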
