clspv
objc4
Our great sponsors
clspv | objc4 | |
---|---|---|
8 | 2 | |
574 | 6 | |
2.3% | - | |
9.0 | 0.0 | |
8 days ago | almost 3 years ago | |
LLVM | Objective-C++ | |
Apache License 2.0 | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
clspv
-
Vcc – The Vulkan Clang Compiler
See https://github.com/google/clspv for an OpenCL implementation on Vulkan Compute. There are plenty of quirks involved because the two standards use different varieties of SPIR-V ("kernels" vs. "shaders") and provide different guarantees (Vulkan Compute doesn't care much about numerical accuracy). The Mesa folks are also looking into this as part of their RustiCL (a modern OpenCL implementation) and Zink (implementing OpenGL and perhaps OpenCL itself on Vulkan) projects.
-
AMD's CDNA 3 Compute Architecture
Vulkan Compute backends for numerical compute (as typified by both OpenCL and SYCL) are challenging, you can look at Google's cspv https://github.com/google/clspv project for the nitty gritty details. The lowest-effort path is actually via some combination of Rocm (for hardware that AMD bothers to support themselves) and the Mesa project's Rusticl backend (for everything else).
-
WSL with CUDA Support
D3D12 has more compute features than Vulkan has. It works out for DXVK because games often don’t use those, but it’ll cause much more issues with CLon12.
By the way, if you are ready to have a _limited_ implementation without a full feature set because of Vulkan API limitations, clvk is a thing. The list of limitations of that approach is at https://github.com/google/clspv/blob/master/docs/OpenCLCOnVu...
tldr: Vulkan and OpenCL SPIR-V dialects are different, and the former has significant limitations affecting this use case
- Resources for Vulkan GPGPU searched
-
Low overhead C++ interface for Apple's Metal API
For OpenCL on DX12, the test suite doesn't pass yet. Every Khronos OpenCL 1.2 CTS test passes on at least one hardware driver, but there's none that pass them all. That is why CLon12 isn't submitted to Khronos's compliant products list yet.
The pointer semantics that Vulkan has aren't very amenable to implementing a compliant OpenCL implementation on top of. There are also some other limitatons: https://github.com/google/clspv/blob/master/docs/OpenCLCOnVu....
-
[Hardware Unboxed] - Apple M1 Pro Review - Is It Really Faster than Intel/AMD?
Vulkan is much more limited, notably because of Vulkan's SPIR-V dialect limitations. That makes a compliant OpenCL 1.2 impl on top of Vulkan impossible. (see: https://github.com/google/clspv/blob/master/docs/OpenCLCOnVulkan.md)
-
Cross Platform GPU-Capable Framework?
OpenCL really is your best bet for a cross-platform GPU-capable framework. OpenCL 3.0 cleared out a lot of the cruft from OpenCL 2.x so it's seeing a lot more adoption. The most cross-platform solution is still OpenCL 1.2, largely for MacOS, but OpenCL 3.0 is becoming more and more common for Windows and Linux and multiple devices. Even on platforms without native OpenCL support there are compatibility layers that implement OpenCL on top of DirectX (OpenCLOn12) or Vulkan (clvk and clspv).
objc4
-
Low overhead C++ interface for Apple's Metal API
> The specific method that's causing me trouble right at the moment is "computeCommandEncoder"
Yeah, it looks like there's no way to avoid autorelease here.
> It looks like objc_retainAutoreleasedReturnValue might be exactly what I'm looking for, even if it isn't 100% guaranteed; if I'm understanding, it would be safe, and wouldn't actually leak as long as you had an autorelease pool somewhere in the chain.
Indeed it would be safe and wouldn't leak, but the optimization is very much not guaranteed. It's based on the autorelease implementation manually reading the instruction at its return address to see if it's about to call objc_retainAutoreleaseReturnValue. See the description here:
https://github.com/apple-opensource/objc4/blob/a367941bce42b...
In fact – I did not know this before just now – on every arch other than x86-64 it requires a magic assembly sequence to be placed between the call to an autoreleasing method and the call to objc_retainAutoreleaseReturnValue.
It looks like swiftc implements this by just emitting LLVM inline asm blocks:
%6 = call %1* bitcast (void ()* @objc_msgSend to %1* (i8*, i8*, %0*)*)(i8* %5, i8* %3, %0* %4) #4
What are some alternatives?
OpenCLOn12 - The OpenCL-on-D3D12 mapping layer
metal-cpp - Metal-cpp is a low-overhead C++ interface for Metal that helps developers add Metal functionality to graphics apps, games, and game engines that are written in C++.
kompute - General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases. Backed by the Linux Foundation.
MoltenVK - MoltenVK is a Vulkan Portability implementation. It layers a subset of the high-performance, industry-standard Vulkan graphics and compute API over Apple's Metal graphics framework, enabling Vulkan applications to run on macOS, iOS and tvOS.
GLSL - GLSL Shading Language Issue Tracker
metal-rs - Rust bindings for Metal
alpaka - Abstraction Library for Parallel Kernel Acceleration :llama:
wgpu - Cross-platform, safe, pure-rust graphics api.
SPIRV-VM - Virtual machine for executing SPIR-V
clvk - Implementation of OpenCL 3.0 on Vulkan