-
DirectXMath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
Good article, but note that if the hardware supports the division instruction, will be much faster than the described workarounds.
Personally, I recently did what’s written in 2 cases: FP32 division on ARMv7, and FP64 division on GPUs who don’t support that instruction.
For ARM CPUs, not only they have FRECPE, they also have FRECPS for the iteration step. An example there: https://github.com/microsoft/DirectXMath/blob/jan2021/Inc/Di...
For GPUs, Microsoft classified FP64 division as “extended double shader instruction” and the support is optional. However, GPUs are guaranteed to support FP32 division. The result of FP32 division provides an awesome starting point for Newton-Raphson refinement in FP64 precision.
-
JetBrains
Tell us how you use coding tools. You may win a prize! Are you a developer or a data analyst? Share your thoughts about your coding tools in our short survey and get a chance to win prizes!