| | generative-models | instruct-pix2pix |
|---|---|---|
| Mentions | 21 | 21 |
| Stars | 22,508 | 5,989 |
| Growth | 4.4% | - |
| Activity | 7.3 | 0.0 |
| Last commit | 28 days ago | 2 months ago |
| Language | Python | Python |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
generative-models
-
Creating Videos with Stable Video Diffusion
git clone https://github.com/Stability-AI/generative-models.git && cd generative-models
- Show HN: I have created a free text-to-image website that supports SDXL Turbo
- How To Increase Performance Time on MacOS
-
Introducing Stable Video Diffusion: Stability AI's New AI Research Tool for Image-to-Video Synthesis
Generative Models by Stability AI Github Repository
-
image-to-video tutorial
    # clone SD repo
    !git clone https://github.com/Stability-AI/generative-models.git

    # cd into working directory
    # the % sets the pwd globally as usually each command is run in a subshell in google colab
    %cd /content/generative-models/

    # installing dependencies
    !pip install -r requirements/pt2.txt
    !pip install .

    # HACK
    # I was getting ModuleNotFoundError: No module named 'scripts'
    # This is what ChatGPT suggested (let me know if there is a better way)
    file_path = '/content/generative-models/scripts/sampling/simple_video_sample.py'
    new_text = "import sys\nsys.path.append('/content/generative-models')\n\n"
    with open(file_path, 'r') as file:
        original_content = file.read()
    updated_content = new_text + original_content
    with open(file_path, 'w') as file:
        file.write(updated_content)

    # Need to create a checkpoints/ folder - that is where the system looks for weights
    import os
    dir_name = 'checkpoints'
    if not os.path.exists(dir_name):
        os.makedirs(dir_name)
        print(f"Directory '{dir_name}' created")
    else:
        print(f"Directory '{dir_name}' already exists")

    # Download weights into checkpoints/ folder
    from huggingface_hub import hf_hub_download
    hf_hub_download(repo_id="stabilityai/stable-video-diffusion-img2vid",
                    filename="svd.safetensors",
                    local_dir="checkpoints",
                    local_dir_use_symlinks=False)

    # I can't remember if this step is needed but it aims to reduce the memory footprint of pytorch
    # I kept getting CUDA out of memory
    # I got these instructions from the out of memory error message
    os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:512'
    print(os.environ['PYTORCH_CUDA_ALLOC_CONF'])

    # Inside of scripts/sampling/simple_video_sample.py you need to make 2 updates
    # 1. input_path (line 26): update to the location of your file
    #    (I attached Gdrive so mine was "/content/drive/MyDrive/examples/car.jpeg")
    # 2. decoding_t (line 34): update it to 5. You need to do this for memory
    #    preservation (CUDA out of memory).
    #    I'm not sure if 5 is the best value but it worked for me

    # Finally generate the video (output will be in the outputs/ folder)
    !python scripts/sampling/simple_video_sample.py
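The `sys.path` hack in this tutorial prepends two lines to `simple_video_sample.py` every time the cell runs, so re-running it stacks duplicate headers in the script. A minimal sketch of an idempotent variant is below. `prepend_once` and `ensure_dir` are hypothetical helper names, not part of the repository, and the paths are simply the ones quoted in the tutorial.

```python
from pathlib import Path

# Paths quoted in the tutorial above; adjust for your own Colab layout.
SCRIPT = "/content/generative-models/scripts/sampling/simple_video_sample.py"
HEADER = "import sys\nsys.path.append('/content/generative-models')\n\n"


def prepend_once(path, header):
    """Prepend `header` to the file at `path` unless it is already there.

    Returns True if the file was modified and False if it already started
    with the header, so re-running a notebook cell is harmless.
    """
    target = Path(path)
    text = target.read_text()
    if text.startswith(header):
        return False
    target.write_text(header + text)
    return True


def ensure_dir(name="checkpoints"):
    """Create the weights folder if it is missing and return its Path."""
    folder = Path(name)
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

In the notebook this would be called as `prepend_once(SCRIPT, HEADER)` followed by `ensure_dir("checkpoints")`, in place of the open/read/write and `os.makedirs` steps.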
-
Stable Video Diffusion
It looks like the Hugging Face page links to their GitHub repo, which seems to have Python scripts to run these: https://github.com/Stability-AI/generative-models
- GitHub - Stability-AI/generative-models: Generative Models by Stability AI
-
How does ComfyUI load SDXL 1.0 so VRAM-efficiently? How do I do the same in vanilla python code?
However, when using the example code from HuggingFace or setting up stuff from the StabilityAI/generative-models repo in a jupyter notebook, I end up using 21 GB of VRAM just for running the default pipeline (with no base model output). If I try to run the extra `base.vae.decode(base_latents)` after generation to get unrefined outputs, I get a CUDA out of memory error as it blows past the 24GB of my NVIDIA RTX 3090.
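One common way to cut VRAM in plain Python, which is not necessarily what ComfyUI does internally, is to load the weights in half precision and let the library move submodules onto the GPU only while they run. The sketch below assumes the `diffusers`, `torch`, and `accelerate` packages and the `stabilityai/stable-diffusion-xl-base-1.0` checkpoint. `weights_gib` and `load_sdxl_low_vram` are hypothetical helper names, and the 2.6B figure is the approximate UNet parameter count from the SDXL report, used here only for rough arithmetic.

```python
def weights_gib(n_params, bytes_per_param):
    """Rough size of a model's weights in GiB (activations not included)."""
    return n_params * bytes_per_param / 2**30


def load_sdxl_low_vram(model_id="stabilityai/stable-diffusion-xl-base-1.0"):
    """Load SDXL base in fp16 with CPU offload and sliced/tiled VAE decoding."""
    # Imported lazily so the arithmetic helper works without a GPU stack.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half the bytes per weight of fp32
        variant="fp16",
        use_safetensors=True,
    )
    # Keep each submodule on the CPU until it is needed (requires accelerate).
    pipe.enable_model_cpu_offload()
    # Decode latents in slices/tiles to lower the peak memory of the VAE step.
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()
    return pipe


if __name__ == "__main__":
    unet_params = 2.6e9  # assumption: approximate SDXL base UNet size
    print(f"fp32: {weights_gib(unet_params, 4):.1f} GiB")
    print(f"fp16: {weights_gib(unet_params, 2):.1f} GiB")
```

The VAE decode is the step that fails in the question above, so the slicing and tiling calls are the part most likely to matter there; whether they are enough on a 24 GB card depends on the image size.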
- SDXL 1.0 is out!
-
SDXL 0.9 Anyone having luck NOT centering subjects?
SDXL uses cropping information as part of the conditioning. Images were randomly cropped during training and the coordinates of the crop were included as two integers at the end of the conditioning vector. If you're using ComfyUI you can use the CLIPTextEncodeSDXL node to specify where the upper left corner of the image should appear to be in relation to some hypothetical uncropped image. The SDXL report includes a figure with examples.
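Outside ComfyUI, the same crop conditioning can be set from Python. The sketch below assumes the `diffusers` library, whose `StableDiffusionXLPipeline` call accepts `original_size`, `target_size`, and `crops_coords_top_left`. `crop_conditioning` and `generate` are hypothetical helper names, and the pipeline object is assumed to be loaded elsewhere.

```python
def crop_conditioning(top_left=(0, 0), size=(1024, 1024)):
    """Build the size/crop keyword arguments for an SDXL pipeline call.

    `top_left` is where the upper-left corner of the output should appear
    to sit inside a hypothetical uncropped image; (0, 0) means "not cropped".
    """
    top, left = top_left
    if top < 0 or left < 0:
        raise ValueError("crop coordinates must be non-negative")
    return {
        "original_size": size,
        "target_size": size,
        "crops_coords_top_left": (top, left),
    }


def generate(pipe, prompt, top_left=(0, 0)):
    """Run an already-loaded StableDiffusionXLPipeline with crop conditioning."""
    return pipe(prompt=prompt, **crop_conditioning(top_left)).images[0]
```

Calling `generate(pipe, "a portrait photo", top_left=(0, 256))` asks for an image that looks shifted relative to the uncropped frame, which is one thing to try when subjects keep landing dead centre.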
instruct-pix2pix
-
Stable Video Diffusion
My guess is you're thinking of InstructPix2Pix[1], with prompts like "make the sky green" or "replace the fruits with cake"
[1] https://github.com/timothybrooks/instruct-pix2pix
-
AI image editors with “text to filter” function?
This comes from https://github.com/timothybrooks/instruct-pix2pix, there is also an extension to use it in Automatic1111 Stable diffusion webui.
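For readers who want to try instruction-based editing without a web UI, here is a minimal sketch using the `diffusers` port of the model rather than the scripts in the repository linked above. It assumes the `diffusers` and `torch` packages, a CUDA GPU, and the `timbrooks/instruct-pix2pix` checkpoint on Hugging Face. `edit_kwargs` and `edit_image` are hypothetical helper names, and the default values are only starting points.

```python
def edit_kwargs(instruction, steps=20, image_guidance_scale=1.5, guidance_scale=7.5):
    """Collect call arguments for an InstructPix2Pix edit.

    `image_guidance_scale` controls how closely the result sticks to the
    input image; `guidance_scale` controls how strongly the text is followed.
    """
    if image_guidance_scale < 1:
        # To my understanding the diffusers pipeline expects a value of at least 1.
        raise ValueError("image_guidance_scale must be >= 1")
    return {
        "prompt": instruction,
        "num_inference_steps": steps,
        "image_guidance_scale": image_guidance_scale,
        "guidance_scale": guidance_scale,
    }


def edit_image(image, instruction, model_id="timbrooks/instruct-pix2pix", **overrides):
    """Apply a text instruction such as "make the sky green" to a PIL image."""
    # Imported lazily so edit_kwargs can be used without a GPU stack.
    import torch
    from diffusers import StableDiffusionInstructPix2PixPipeline

    pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
    return pipe(image=image, **edit_kwargs(instruction, **overrides)).images[0]
```

Raising `image_guidance_scale` keeps more of the original picture, while raising `guidance_scale` pushes harder toward the instruction; the two usually need to be tuned together.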
- [D] NeRF, LeRF, Prolific Dreamer, Neuralangelo, and a lot of other cool NeRF research
-
Was it SD that had the ability to edit a photo using prompts?
InstructPix2Pix
-
Alternate download location for instruct-pix2pix-00-22000.ckpt?
Is there another place I can download the model? I tried downloading the file using the instructions on this page:
-
Using our photoshop plugin for some cool image editing! :D
It comes from https://github.com/timothybrooks/instruct-pix2pix, you can try it out https://huggingface.co/spaces/timbrooks/instruct-pix2pix
-
instruct pix2pix faces always come out messed up. The rest is really good. Any idea how to fix this?
interesting, I've been running it using this: https://github.com/timothybrooks/instruct-pix2pix/blob/main/LICENSE
-
Everybody is always talking about AGI. I'm more curious about using the tools that we have now.
This is already done and it's already been implemented in the most popular web-ui for stable diffusion too. Granted the results aren't perfect yet.
-
gif2gif: Quick and easy webui extension for dropping animated GIFs into img2img
Select the script, drop in a GIF, use img2img as normal to process it. Supports quick non-ffmpeg interpolation, and works surprisingly well with InstructPix2Pix. Intended to be a fun no-nonsense GIF pipeline.
-
NMKD Stable Diffusion GUI 1.9.0 is out now, featuring InstructPix2Pix - Edit images simply by using instructions! Link and details in comments.
Github Issue - Closed
What are some alternatives?
background-removal-js - Remove backgrounds from images directly in the browser environment with ease and no additional costs or privacy concerns. Explore an interactive demo.
stable-diffusion-webui - Stable Diffusion web UI
wizmap - Explore and interpret large embeddings in your browser with interactive visualization! 📍
stable-diffusion-webui-instruct-pix2pix - Extension for webui to run instruct-pix2pix
evernote-ai-chatbot
GFPGAN - GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
gping - Ping, but with a graph
gif2gif - Automatic1111 Animated Image (input/output) Extension
graphic-walker - An open source alternative to Tableau. Embeddable visual analytic
k-diffusion - Karras et al. (2022) diffusion models for PyTorch
xgen - Salesforce open-source LLMs with 8k sequence length.
prolificdreamer - Official code of ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation (NeurIPS 2023 Spotlight)