Sloppy Colab notebook: https://github.com/IzumiSatoshi/deforumed-walk/blob/main/notebooks/deforum_walk_v1.ipynb
So, I think a good starting point is to play with diffusers (https://github.com/huggingface/diffusers) on Colab. Diffusers is a library that makes it easy to run Stable Diffusion from code. In particular, take a closer look at the depth2img pipeline, because it is the core of this game. Colab comes already set up, so it saves much of the time you would otherwise spend building a Python environment.
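To make the depth2img suggestion concrete, here is a minimal sketch of calling that pipeline from a Colab cell. It is not the code from my notebook. The checkpoint name `stabilityai/stable-diffusion-2-depth`, the default `strength`, and both helper functions are assumptions chosen for illustration, and rounding the image size to a multiple of 8 is only a precaution on my side.

```python
# Sketch only. Assumes `pip install diffusers transformers accelerate` has been run
# (Colab already ships torch and Pillow) and that a GPU runtime is selected.
from typing import Optional, Tuple

# Assumption: a public depth-conditioned Stable Diffusion checkpoint.
MODEL_ID = "stabilityai/stable-diffusion-2-depth"


def snap_to_multiple_of_8(width: int, height: int) -> Tuple[int, int]:
    """Round both dimensions down to a multiple of 8 (hypothetical helper)."""
    return max(8, width - width % 8), max(8, height - height % 8)


def run_depth2img(init_image, prompt: str, strength: float = 0.7,
                  negative_prompt: Optional[str] = None):
    """Repaint `init_image` (a PIL image) from `prompt` while keeping its depth layout."""
    # Heavy imports are deferred so that defining this function stays cheap.
    import torch
    from diffusers import StableDiffusionDepth2ImgPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # Downloads the weights on first use.
    pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(MODEL_ID, torch_dtype=dtype)
    pipe = pipe.to(device)

    size = snap_to_multiple_of_8(*init_image.size)
    init_image = init_image.convert("RGB").resize(size)

    # The pipeline estimates a depth map from the input image and conditions on it.
    # `strength` controls how far the result may drift from the input image.
    result = pipe(prompt=prompt, image=init_image,
                  negative_prompt=negative_prompt, strength=strength)
    return result.images[0]
```

In a notebook you could then call something like `run_depth2img(Image.open("frame.png"), "a forest path")`, where `frame.png` is a placeholder for any image you upload. Trying a few different `strength` values is a quick way to get a feel for what the pipeline does.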
Next, you need to become familiar with diffusion models. I recommend this Hugging Face course (https://github.com/huggingface/diffusion-models-class) because it is very high quality and you learn while using diffusers. At first glance it may not seem directly related to this game, but in my case, knowing what is happening inside diffusers helped me in many ways: trial and error, inspiration for ideas, etc. I had no knowledge of PyTorch (the deep learning library that diffusers is built on), so I also took this course (https://www.udacity.com/course/deep-learning-pytorch--ud188), which is listed in the prerequisites for the Hugging Face course. It was also very good.
Then I started playing with Deforum 3D (https://github.com/deforum-art/deforum-stable-diffusion). There is a notebook available that works with Colab. This tutorial (https://www.youtube.com/watch?v=7ZuI56TChvg) will also be helpful.