CogView
storyteller
| | CogView | storyteller |
|---|---|---|
| Mentions | 16 | 1 |
| Stars | 1,593 | 468 |
| Growth | 1.8% | - |
| Activity | 4.2 | 5.9 |
| Latest commit | 7 months ago | 8 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
CogView
-
The CogView2 web app is available at replicate.com
This web app, which is mentioned in the original CogView GitHub repo, uses (or used) a "slightly different" model than the one in the CogView2 GitHub repo.
-
DALL-E 2 alternative: CogView2 checkpoints now available for download: best released text2image model (9b Transformer)
The web app for CogView2 has been available for months, if I am not mistaken. See this GitHub repo for the link.
-
Paper+code "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers", Ding et al 2022
Is the CogView2 demo available here the same as the one released with the CogView2 paper?
-
The official CogView web app might be using a new model. Evidence and link in a comment. Example: "wedding portrait. no watermark. royalty free." (3 images)
On March 16, 2022, the following was added to the CogView GitHub repo: "News! The demo for a better and faster CogView2 (formal version, March 2022) is available! The lastest model also supports English input, but to translate them into Chinese often could be better."
-
6 of an AI's creations for input text description "fox at night. no watermark." The last image is a mind-bender.
Yes, CogView 2 is free to use and is available as a web app here. CogView 2's image description text needs to be in simplified Chinese; an English-to-simplified-Chinese translation icon appears after typing 9 characters. The styles are a quick way to add text snippets to the image description; I forgot to mention that I used the "HD Photography" style for this post.
-
New Colab notebook "Multi Perceptor VQGAN + CLIP [Public]" from rdurant722. This notebook allows the optional use of a 2nd CLIP model for greater accuracy at the cost of slower processing speed. Link in comment. Example: "Enchanted Forest by James Gurney" at various iterations.
GitHub: https://github.com/THUDM/CogView
-
cat in a hat by an artificial intelligence system that composes an image to match a given text description
I used the web app for the new version 2 of CogView, then upscaled with a web app version of SwinIR, and then cropped with a paint app. Both web apps are free.
-
This image was the result of a text-to-image artificial intelligence system that generated an image for my text description request of "Abstract and non-abstract art combining a cat and tendrils of neon light"
This is from the new version of CogView that was released a few weeks ago. There is a link to the new web version about 1/4 of the way down its GitHub page. It is free to use, with no paid option available. Tip: The input needs to be in simplified Chinese. An English-to-simplified Chinese translator icon appears after typing 9 characters.
-
Tip for new CogView version: adding sentence "No watermark." at the end of the text prompt seems to greatly reduce the occurrence of watermarks. See comment for related tip from a developer. Example: "illustration of a happy SpongeBob SquarePants. No watermark."
A method to prevent generating watermarks.
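The tip above amounts to a small prompt transformation. A minimal sketch of it in Python, where the helper name and normalization behavior are illustrative assumptions and not part of the CogView repo:

```python
def add_no_watermark(prompt: str) -> str:
    """Hypothetical helper: append the "No watermark." sentence that the
    tip above suggests reduces watermark artifacts in CogView output."""
    prompt = prompt.strip()
    # Ensure the original prompt ends as a complete sentence first.
    if not prompt.endswith("."):
        prompt += "."
    return prompt + " No watermark."

print(add_no_watermark("illustration of a happy SpongeBob SquarePants"))
# → illustration of a happy SpongeBob SquarePants. No watermark.
```

The same pattern fits the other example prompts in these posts, e.g. "fox at night." or "wedding portrait. royalty free."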
-
[P] New version of CogView (text-to-image) is available in online demo
GitHub repo for older version of CogView.
storyteller
What are some alternatives?
SwinIR - SwinIR: Image Restoration Using Swin Transformer (official repository)
ez-text2video - Easily run text-to-video diffusion with customized video length, fps, and dimensions on 4GB video cards or on CPU.
CogView2 - official code repo for paper "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers"
Sketch-Guided-Stable-Diffusion - Unofficial Implementation of the Google Paper - https://sketch-guided-diffusion.github.io/
DialogRPT - EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data"
aphantasia - CLIP + FFT/DWT/RGB = text to image/video
DALLE-mtf - OpenAI's DALL-E for large-scale training in mesh-tensorflow.
Gen-L-Video - The official implementation for "Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising".
stable-diffusion-docker - Run the official Stable Diffusion releases in a Docker container with txt2img, img2img, depth2img, pix2pix, upscale4x, and inpaint.
Awesome-Video-Diffusion - A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
LLM-groundedDiffusion - LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusion: LMD)
Text-to-Image-Synthesis - Pytorch implementation of Generative Adversarial Text-to-Image Synthesis paper