AMD Funded a Drop-In CUDA Implementation Built on ROCm: It's Open-Source

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • ZLUDA

    CUDA on AMD GPUs

  • From the same repo, I found this excellent, well-written architecture document: https://github.com/vosen/ZLUDA/blob/master/ARCHITECTURE.md

    I love the direct, "no bullshit" style of writing.

    Some gems:

    > Anyone familiar with C++ will instantly understand that compiling it is a complicated affair.

    > Additionally CUDA allows, to a large degree, mixing CPU code and GPU code. What does all this complexity mean for ZLUDA? Absolutely nothing

    > Since an application can dynamically link to either Driver API or Runtime API, it would seem that ZLUDA needs to provide both. In reality very few applications dynamically link to Runtime API. For the vast majority of applications it's sufficient to provide Driver API for dynamic (runtime) linking.
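The last quote is about applications that load the Driver API at run time instead of linking against the Runtime API. A minimal Python sketch of that kind of lookup is below. The library names (`libcuda.so.1`, `nvcuda.dll`) and the `cuDriverGetVersion` entry point come from the CUDA Driver API; the helper names `load_driver_api` and `driver_version` are made up for illustration, and nothing here is taken from ZLUDA's own code.

```python
import ctypes
import sys


def load_driver_api():
    """Try to load the CUDA Driver API library at run time; return None if absent."""
    if sys.platform == "win32":
        names = ["nvcuda.dll"]
    else:
        names = ["libcuda.so.1", "libcuda.so"]
    for name in names:
        try:
            return ctypes.CDLL(name)
        except OSError:
            # Library not installed under this name; try the next candidate.
            continue
    return None


def driver_version(lib):
    """Return the version reported by cuDriverGetVersion, or None on any failure."""
    if lib is None:
        return None
    fn = getattr(lib, "cuDriverGetVersion", None)
    if fn is None:
        return None
    version = ctypes.c_int(0)
    # A return value of 0 is CUDA_SUCCESS.
    if fn(ctypes.byref(version)) != 0:
        return None
    return version.value


if __name__ == "__main__":
    print("Driver API version:", driver_version(load_driver_api()))
```

Whichever library the loader resolves under that name answers the call, which is the property a drop-in replacement relies on.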

  • InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.

  • I actually used the rocm/pytorch image you also linked.

    I'm not sure what you're pointing to with your reference to the Fedora-based images. I'm quite happy with my NixOS install and really don't want to switch to anything else. As long as I have the correct kernel module, my host OS really shouldn't matter for running any of the images.

    I'm sure it can be made to work with many base images; my point was just that the dependency management around PyTorch was in a bad state, where it is extremely easy to break.

    > Anyways, hopefully this PR fixes the immediate issue: https://github.com/invoke-ai/InvokeAI/pull/5714/files

    It does! At least for me. It is my PR after all ;)
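Since the complaint here is that a PyTorch install is easy to break, a small check of which build actually got installed can help. The sketch below relies on `torch.version.hip` and `torch.version.cuda`, which PyTorch sets to a version string or `None` depending on the build. The function name `torch_backend` and its return labels are hypothetical, and this is not part of the linked PR.

```python
import importlib


def torch_backend():
    """Classify the installed PyTorch build as 'rocm', 'cuda', 'cpu' or 'missing'."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # Either torch is not installed or its native libraries failed to load.
        return "missing"
    if getattr(torch.version, "hip", None):
        return "rocm"
    if getattr(torch.version, "cuda", None):
        return "cuda"
    return "cpu"


if __name__ == "__main__":
    print("PyTorch build:", torch_backend())
```

Running this right after a dependency update shows quickly whether a ROCm wheel was silently replaced by a CUDA or CPU-only one.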

  • ROCm

    Discontinued ROCm Website [Moved to: https://github.com/ROCm/ROCm.github.io] (by ROCm)

  • ROCm is not spelled out anywhere in their documentation, and the best answers in search come from GitHub rather than from official AMD documents.

    "Radeon Open Compute Platform"

    https://github.com/ROCm/ROCm/issues/1628

    And they wonder why they are losing. Branding absolutely matters.

  • Deepspeed-Windows

    Deepspeed windows information

  • I just went through this over the weekend. If you're running on Windows and want to use DeepSpeed, you still have to use CUDA 12.1, because DeepSpeed 13.1 is the latest version that works with 12.1. There's no DeepSpeed build for Windows that works with 12.3.

    I tried to get it working this weekend, but it was a huge PITA, so I switched to putting everything into WSL2, then running Arch on there with PyTorch etc. in containers so I could flip versions easily.

    I'm still working on that part. Halfway into it, my WSL2 completely broke and I had to reinstall Windows. The p9 networking stopped working.

    https://github.com/S95Sedan/Deepspeed-Windows

  • chipStar

    chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.

  • There is already a work-in-progress implementation of HIP on top of OpenCL (https://github.com/CHIP-SPV/chipStar), and the Mesa RustiCL folks are quite interested in getting that to run on top of Vulkan.

  • stable-diffusion-webui

    Stable Diffusion web UI

  • I would love to have a native Stable Diffusion experience; my RX 580 takes 30s to generate a single image. But it does work after following https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki...

    I got this up and running on my Windows machine in short order, and I don't even know what Stable Diffusion is.

    But again, it would be nice to have first-class support to locally participate in the fun.

  • Cgml

    GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.

  • I did a few times with Direct3D 11 compute shaders. Here’s an open-source example: https://github.com/Const-me/Cgml

    Pretty sure Vulkan is going to work equally well; at the very least, there’s an open-source DXVK project which implements D3D11 on top of Vulkan.

  • ncnn

    ncnn is a high-performance neural network inference framework optimized for the mobile platform

  • ncnn uses Vulkan for GPU acceleration; I've seen it used in a few projects to get AMD hardware support.

    https://github.com/Tencent/ncnn

  • hipDNN

    A thin wrapper around MIOpen and cuDNN

  • ROCm-docker

    Dockerfiles for the various software layers defined in the ROCm software platform

  • https://rocm.docs.amd.com/projects/install-on-linux/en/lates... links to ROCm/ROCm-docker (https://github.com/ROCm/ROCm-docker), which is the source of docker.io/rocm/rocm-terminal (https://hub.docker.com/r/rocm/rocm-terminal):

      docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/rocm-terminal
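The two `--device` flags in that command pass the host's `/dev/kfd` and `/dev/dri` device nodes into the container, so the command only helps if those nodes exist on the host. A small pre-flight check, sketched here in Python with a hypothetical helper name `missing_rocm_devices`, reports which of them are absent before the container is started.

```python
import os


def missing_rocm_devices(paths=("/dev/kfd", "/dev/dri")):
    """Return the entries of `paths` that do not exist on this host."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    missing = missing_rocm_devices()
    if missing:
        print("Missing device nodes:", ", ".join(missing))
    else:
        print("/dev/kfd and /dev/dri are both present.")
```

The check only confirms that the paths exist. It says nothing about group permissions, which is what `--group-add video` addresses.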

  • bazzite

    Bazzite is a custom image built upon Fedora Atomic Desktops that brings the best of Linux gaming to all of your devices - including your favorite handheld.

  • https://github.com/ublue-os/bazzite/blob/main/Containerfile#... has, in addition to fan and power controls, automatic updates on desktop, supergfxctl, system76-scheduler, and an fsync kernel:

      rpm-ostree install rocm-hip \

  • config

    A layer to provide configuration files (udev rules, service units, etc) (by ublue-os)

  • https://github.com/ublue-os/config/blob/main/build/ublue-os-...

    There's a default `distrobox` with pytorch in ublue-os/config//build/ublue-os-just/etc-distrobox/apps.ini:

  • pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration (by ROCm)

  • https://github.com/ROCm/pytorch/blob/main/Dockerfile

    The ublue-os (and so also bazzite) OCI host image Containerfile has Sunshine installed, which is a 4k HDR 120fps remote desktop solution for gaming.

    ublue-os/config//Containerfile:

  • Sunshine

    Self-hosted game stream host for Moonlight.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
