Bolt
Bolt is a C++ template library optimized for GPUs. Bolt provides high-performance library implementations for common algorithms such as scan, reduce, transform, and sort. (by HSA-Libraries)
OpenCL had a bit of a "second-mover curse" where instead of trying to solve one problem (GPGPU acceleration) it tried to solve everything (a generalized framework for heterogeneous dispatch), and it just kinda sucks to actually use. It's not that it's slower or faster; in principle it should be the same speed once dispatched to the hardware (+/- any C/C++ optimization gotchas, of course). It just requires an obscene amount of boilerplate to "draw the first triangle" (or launch the first kernel), much like Vulkan.
HIP was supposed to rectify this, but now you're buying into AMD's custom language and its limitations... and there are limitations, things that CUDA can do that HIP can't (texture unit access was an early one - and texture units aren't just for texturing, they're for coalescing all kinds of 2d/3d/higher-dimensional memory access). And AMD has a history of abandoning these projects after a couple years and leaving them behind and unsupported... like their Thrust framework counterpart, Bolt, which hasn't been updated in 8 years now.
https://github.com/HSA-Libraries/Bolt
The old bit about "Vendor B" leaving behind a "trail of projects designed to pad resumes and show progress to middle managers" still rings absolutely true with AMD. AMD has a big uphill climb in general to shake this reputation for being completely unserious about their software... and I'm not even talking about drivers here.
http://richg42.blogspot.com/2014/05/the-truth-on-opengl-driv...
> AMD doesn't have a library of warp-level/kernel-level/global "software primitives" like Cuda Unbound or Thrust either.
The ROCm software primitives library is rocPRIM, and the ROCm equivalent to Thrust is rocThrust.
https://github.com/ROCmSoftwarePlatform/rocPRIM
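For anyone unsure what "software primitives" means here: libraries in this space provide building blocks like reduce, scan, transform, and sort, and Thrust exposes calls with matching names (thrust::reduce, thrust::inclusive_scan, thrust::transform, thrust::sort). Below is a rough CPU-only sketch of those four operations, using the sequential C++ standard library as a stand-in rather than any GPU library. The helper names are made up for illustration and are not taken from Bolt, Thrust, rocPRIM, or rocThrust.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper names, for illustration only. Each wraps a sequential
// standard-library algorithm that plays the role a GPU primitive would.

// reduce: combine all elements into a single value (here, a sum).
int sum_reduce(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// scan: inclusive prefix sum, out[i] = v[0] + ... + v[i].
std::vector<int> prefix_scan(const std::vector<int>& v) {
    std::vector<int> out(v.size());
    std::partial_sum(v.begin(), v.end(), out.begin());
    return out;
}

// transform: apply a function to every element (here, squaring).
std::vector<int> square_transform(const std::vector<int>& v) {
    std::vector<int> out(v.size());
    std::transform(v.begin(), v.end(), out.begin(),
                   [](int x) { return x * x; });
    return out;
}

// sort: return an ascending-order copy of the input.
std::vector<int> sorted_copy(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Small usage check on a three-element input.
void demo_checks() {
    const std::vector<int> v{3, 1, 2};
    assert(sum_reduce(v) == 6);
    assert((prefix_scan(v) == std::vector<int>{3, 4, 6}));
    assert((square_transform(v) == std::vector<int>{9, 1, 4}));
    assert((sorted_copy(v) == std::vector<int>{1, 2, 3}));
}
```

The point of a GPU primitives library is that the same four call shapes run in parallel on device memory, so you don't hand-write a kernel for each one.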