-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
I tried to create a ML framework[0] that would work on both CUDA and OpenCL (and natively on the CPU) around 2015/2016, which included creating FFI wrappers for both CUDA and OpenCL. This is where my experience on the subject (and my contempt for NVIDIA) comes from.
Me memory isn't perfect, but IIRC the situation was roughly the following: We were quite short on resources (both devtime and money), which meant that we had to choose our scope wisely. Optimally we would have implemented both CUDA and OpenCL 2.0, but we had to settle for OpenCL 1.2 (which offered reduced performance, but was "good enough" for inference). IIRC OpenCL 2.0 was very very similar in what capabilities it assumed and offered to the CUDA version at the time, and cards like the GTX Titan X had "compute capabilities" that supported features like shared virtual memory between CPU and GPU in CUDA at the time. In fact the advances around memory management (and async copying) that were present in CUDA and not in OpenCL 1.x were the main source for the performance differences between the two.
From everything that I can tell at that point in time, if NVIDIA would have wanted to support OpenCL 2.0 they could have done so based on technical requirements. What the reason for not doing so is, is just pure speculation (lack of internal resources due to focusing on devtools?), but to me it always looked like they were using the edge they got via their proprietary libraries like cuDNN to get a foot into the field of ML and then purposefully neglected OpenCL to prevent any competitors from catching up. Classic Embrace, Extend, Extinguish.
[0]: https://github.com/autumnai/leaf