HIPIFY
Metric | HIPIFY | stable_diffusion.openvino |
---|---|---|
Mentions | 11 | 47 |
Stars | 318 | 1,528 |
Growth | - | - |
Activity | 0.0 | 0.8 |
Last commit | 5 months ago | 8 months ago |
Language | C++ | Python |
License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
HIPIFY
-
AMD Hip SDK: Making CUDA Applications Run Across Consumer, Pro GPUs and APUs
Right. I can't speak to its correctness/completeness as I've only done a quick installation and smoke test of the ROCm/HIP/MIOpen stack, but there's even a tool that automates the translation [1].
[1] https://github.com/ROCm-Developer-Tools/HIPIFY
- How to run Llama 13B with a 6GB graphics card
-
How Nvidia’s CUDA Monopoly in Machine Learning Is Breaking
From https://news.ycombinator.com/item?id=32904285 re: AMD ROCm, HIPIFY:
>> ROCm-Developer-Tools/HIPIFY https://github.com/ROCm-Developer-Tools/HIPIFY :
>> hipify-clang is a clang-based tool for translating CUDA sources into HIP sources. It translates CUDA source into an abstract syntax tree, which is traversed by transformation matchers. After applying all the matchers, the output HIP source is produced.
> AMD ROCm supports PyTorch, TensorFlow, MIOpen, rocBLAS on NVIDIA and AMD GPUs: https://rocmdocs.amd.com/en/latest/Deep_learning/Deep-learni...
-
Stable Diffusion on AMD RDNA3
> Thus, the idea is that through typically negligible effort porting to HIP, your code becomes vendor-independent.
Here, the big AMD mistake was to rename those function prefixes in the first place. It's a mistake that they could have avoided...
What a lot of software codebases did to support AMD (notably PyTorch): the codebase stays in CUDA, and the conversion pass to HIP is done at build time.
See https://github.com/ROCm-Developer-Tools/HIPIFY/blob/amd-stag... for the Perl script to do it.
Then comes the problem of AMD not supporting ROCm HIP on most of their hardware or user base.
On Windows, the ROCm HIP SDK is private and only available under NDA. This means that while you can use Blender w/ HIP on Windows, the Blender builds that you compile yourself will not be able to use ROCm HIP.
On Linux, the supported GPUs are few and far between: Vega20 onwards is supported today, while APUs, RDNA1, and lower-end RDNA2 (6700 XT and below) are excluded unless you resort to unsupported hacks.
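The prefix renaming discussed in this comment can be pictured with a toy sketch. This is not the HIPIFY Perl script or hipify-clang; `CUDA_TO_HIP` and `toy_hipify` are hypothetical names introduced here, and the mapping is a small hand-picked subset of CUDA runtime names and their HIP counterparts. A real port also has to deal with libraries, unsupported APIs, and build flags, none of which this covers.

```python
import re

# Hypothetical, hand-picked subset of CUDA -> HIP renames (illustration only;
# the real HIPIFY tools carry a far larger mapping and handle many more cases).
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaError_t": "hipError_t",
    "cudaSuccess": "hipSuccess",
}

# Match only whole identifiers: the lookarounds reject names that are merely
# a substring of a longer identifier (e.g. "my_cudaFree" is left alone).
# Sorting longest-first is a defensive choice on top of that.
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(k) for k in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")(?![A-Za-z0-9_])"
)


def toy_hipify(source: str) -> str:
    """Return `source` with the known CUDA names replaced by HIP names."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)


cuda_src = (
    "#include <cuda_runtime.h>\n"
    "cudaMalloc(&d_buf, n);\n"
    "cudaMemcpy(d_buf, h_buf, n, cudaMemcpyHostToDevice);\n"
    "cudaFree(d_buf);\n"
)
print(toy_hipify(cuda_src))
```

Running a pass like this at build time is what lets a codebase keep CUDA as its single source while still producing HIP sources for AMD targets.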
-
AI Seamless Texture Generator Built-In to Blender
https://rocmdocs.amd.com/en/latest/Deep_learning/Deep-learni...
RadeonOpenCompute/ROCm_Documentation: https://github.com/RadeonOpenCompute/ROCm_Documentation
ROCm-Developer-Tools/HIPIFY: https://github.com/ROCm-Developer-Tools/HIPIFY :
> hipify-clang is a clang-based tool for translating CUDA sources into HIP sources. It translates CUDA source into an abstract syntax tree, which is traversed by transformation matchers. After applying all the matchers, the output HIP source is produced.
ROCmSoftwarePlatform/gpufort: https://github.com/ROCmSoftwarePlatform/gpufort :
> GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
ROCm-Developer-Tools/HIP: https://github.com/ROCm-Developer-Tools/HIP :
> HIP is a C++ Runtime API and Kernel Language that allows developers to create portable applications for AMD and NVIDIA GPUs from single source code. [...] Key features include:
> - HIP is very thin and has little or no performance impact over coding directly in CUDA mode.
> - HIP allows coding in a single-source C++ programming language including features such as templates, C++11 lambdas, classes, namespaces, and more.
> - HIP allows developers to use the "best" development environment and tools on each target platform.
> - The [HIPIFY] tools automatically convert source from CUDA to HIP.
> - Developers can specialize for the platform (CUDA or AMD) to tune for performance or handle tricky cases.
-
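My workplace requires us to hand in our old computers after the May Day holiday and switch to new domestically made computers and operating systems. Because they are incompatible with Windows software, we also have to install a Windows emulator, which sets office productivity back 10 years. The director complained, "Isn't this replacing the advanced with the backward?" and I thought to myself that even he has noticed.
There is also an automatic conversion tool: https://github.com/ROCm-Developer-Tools/HIPIFY https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-porting-guide.html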
- Hipify: Convert CUDA to Portable C++ Code
- Hipify: Convert CUDA to Portable Hip C++ Code
-
Deep Learning options on Radeon RX 6800
It might be worth checking out HIPIFY, which lets you automatically convert CUDA code to vendor-neutral code that can be run on any GPU. Disclaimer: I have never used it and have no idea how it works.
-
Will NVIDIA's cryptocurrency limiter interfere with nouveau drivers?
CUDA to AMD HIP conversion: https://github.com/ROCm-Developer-Tools/HIPIFY
stable_diffusion.openvino
- FLaNK Stack 05 Feb 2024
-
Installing A1111 Stable Diffusion Error
It might be the --xformers flag. Try getting rid of it: since you're not using CUDA, you wouldn't be able to run with xformers. You could also try --use-cpu all. You can also check out https://github.com/bes-dev/stable_diffusion.openvino, which is probably your best option if you're using a CPU. If your PC's graphics are Intel UHD 620, then you don't have a GPU, and optimized CPU inference would be the best way to run it.
- 4 Reasons to Switch to Intel Arc GPUs
-
why is SD not actually using the GPU?
SD can be run on a CPU without a GPU. I know for certain it can be done with OpenVINO. In fact, on some i7s, it will run at around 3 seconds per iteration. There was a Reddit SD thread a while back saying it can be done with Automatic1111. Also, some recent threads on problems with AMD GPUs suggest Automatic1111 is using the CPU rather than the intended GPU. (Fortunately, I have a GPU, so I don't have to deal with it myself!)
-
Slow Performance on RX 6800 XT; Am I Doing Something Wrong or is ROCm Just this Slow?
I'm not actually entirely convinced that it's even using the GPU. Radeontop shows 0% utilization while the images are generating. Additionally, the listed iteration speed should be impossibly slow for any GPU; it says 26.58s/it, which is slower than just running on a CPU.
-
How can i fix it?
In short, iGPUs are not supported. There's this repo that may or may not help you, but even if it did I wouldn't expect much.
-
Stable Diffusion Web UI for Intel Arc
You can also run it natively on Windows with OpenVINO; there is a barebones web UI for it as well in one of the forks. It requires changing the device setting from CPU to GPU in one of the files. https://github.com/bes-dev/stable_diffusion.openvino
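Switching from CPU to GPU here comes down to the device name that OpenVINO receives when a model is compiled. A minimal sketch, assuming OpenVINO 2023.1 or newer, where `import openvino as ov` exposes `Core`. `pick_device` and `available_openvino_devices` are hypothetical helpers written for this illustration; they are not part of stable_diffusion.openvino, and the file and variable that repo actually uses may differ.

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first entry of `available` matching the preference order.

    OpenVINO reports devices as plain names ("CPU", "GPU") or, when there
    are several of one kind, as indexed names ("GPU.0", "GPU.1").
    """
    for want in preferred:
        for dev in available:
            if dev == want or dev.startswith(want + "."):
                return dev
    raise RuntimeError(f"none of {preferred} found in {list(available)}")


def available_openvino_devices():
    """List the devices OpenVINO can see (assumes OpenVINO >= 2023.1)."""
    import openvino as ov  # imported lazily so pick_device works without it

    return ov.Core().available_devices


# Example with a hard-coded device list, so it runs without OpenVINO installed.
print(pick_device(["CPU", "GPU.0"]))           # prefers the GPU when present
print(pick_device(["CPU"]))                    # falls back to the CPU
```

With OpenVINO installed, the chosen name would be passed as the device argument when compiling, along the lines of `ov.Core().compile_model(model, pick_device(available_openvino_devices()))`.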
-
Intel Arc A770 is underperforming in Tom's Hardware Review
In https://github.com/bes-dev/stable_diffusion.openvino/blob/master/stable_diffusion_engine.py
-
So a new benchmark was done for Stable Diffusion on GPU's
"We ended up using three different Stable Diffusion projects for our testing, mostly because no single package worked on every GPU. For Nvidia, we opted for Automatic 1111's webui version. AMD GPUs were tested using Nod.ai's Shark version, while for Intel's Arc GPUs we used Stable Diffusion OpenVINO."
- Anyone here using Mac?
What are some alternatives?
ZLUDA - CUDA on AMD GPUs
stable-diffusion
ROCm - AMD ROCm™ Software - GitHub Home [Moved to: https://github.com/ROCm/ROCm]
InvokeAI - InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.
ncnn - ncnn is a high-performance neural network inference framework optimized for the mobile platform
stable-diffusion
llama-cpp-python - Python bindings for llama.cpp
stable-diffusion-rocm
rocm-build - build scripts for ROCm
diffusionbee-stable-diffusion-ui - Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
kompute - General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases. Backed by the Linux Foundation.
stable-diffusion - A latent text-to-image diffusion model