V8 does not use NaN-boxing; instead it uses the low bit to distinguish between a 31-bit small integer ("smi") and a pointer. Doubles are additionally sometimes stored inline ("double field unboxing"); I'm not sure how this works exactly. Other times they are heap-allocated. I am not sure if there is a specialized double allocator; I'd like to know.
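To make the low-bit scheme concrete, here is a minimal sketch in C. It is not V8's code: the names (`tagged_t`, `tagged_from_smi`, and so on) are hypothetical, and the choice of "low bit 0 means smi, low bit 1 means pointer" is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tagged word: low bit 0 = small integer, low bit 1 = pointer.
   Names and layout are illustrative assumptions, not V8's definitions. */
typedef uintptr_t tagged_t;

static int tagged_is_smi(tagged_t v) {
    return (v & 1u) == 0;
}

static tagged_t tagged_from_smi(int32_t i) {
    /* A 31-bit payload must fit in [-2^30, 2^30 - 1]. */
    assert(i >= -(1 << 30) && i <= (1 << 30) - 1);
    /* Shift in unsigned arithmetic to avoid shifting a negative value. */
    return (tagged_t)((uint32_t)i << 1);
}

static int32_t tagged_to_smi(tagged_t v) {
    assert(tagged_is_smi(v));
    /* The low bit is 0, so the value is even and dividing by 2 is exact.
       Assumes the usual two's complement conversion from uint32_t. */
    return (int32_t)(uint32_t)v / 2;
}

static tagged_t tagged_from_ptr(void *p) {
    /* Requires at least 2-byte alignment so the low bit is free. */
    assert(((uintptr_t)p & 1u) == 0);
    return (uintptr_t)p | 1u;
}

static void *tagged_to_ptr(tagged_t v) {
    assert(!tagged_is_smi(v));
    return (void *)(v & ~(uintptr_t)1);
}
```

The point of the sketch is that integers never touch the heap, and telling the two cases apart is a single bit test.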
JavaScriptCore uses a tweaked NaN-box [1]: values are stored via a NaN box minus a constant, which avoids requiring a mask when chasing pointers. This makes pointers cheaper but floating point operations more expensive.
SpiderMonkey and Hermes both use straight NaN-boxing to my knowledge.
[1] https://github.com/WebKit/WebKit/blob/ec6b5337e777f9b460ec6b...
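For readers who haven't seen it, a rough sketch of "straight" NaN-boxing in C follows. The layout is an assumption for illustration (a 16-bit tag of 0xFFFC over a 48-bit pointer payload) and all names are hypothetical; it is not taken from SpiderMonkey, Hermes, or JavaScriptCore, and it does not show JavaScriptCore's offset variant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-bit boxed value, kept as an integer at all times. */
typedef uint64_t nanbox_t;

#define NANBOX_TAG_MASK      UINT64_C(0xFFFF000000000000)
#define NANBOX_PTR_TAG       UINT64_C(0xFFFC000000000000) /* a NaN bit pattern */
#define NANBOX_PAYLOAD_MASK  UINT64_C(0x0000FFFFFFFFFFFF)
#define NANBOX_CANONICAL_NAN UINT64_C(0x7FF8000000000000)

static nanbox_t nanbox_from_double(double d) {
    nanbox_t bits;
    /* Real NaNs are collapsed to one canonical pattern so they can never
       collide with the pointer tag. Note that `d != d` is exactly the kind
       of test fast-math flags may optimize away. */
    if (d != d) return NANBOX_CANONICAL_NAN;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

static int nanbox_is_double(nanbox_t v) {
    return (v & NANBOX_TAG_MASK) != NANBOX_PTR_TAG;
}

static double nanbox_to_double(nanbox_t v) {
    double d;
    assert(nanbox_is_double(v));
    memcpy(&d, &v, sizeof d);
    return d;
}

static nanbox_t nanbox_from_ptr(void *p) {
    uint64_t u = (uint64_t)(uintptr_t)p;
    /* Assumes pointers fit in 48 bits, as on common 64-bit targets today. */
    assert((u & ~NANBOX_PAYLOAD_MASK) == 0);
    return NANBOX_PTR_TAG | u;
}

static void *nanbox_to_ptr(nanbox_t v) {
    assert(!nanbox_is_double(v));
    /* This mask is the per-dereference cost that an offset scheme avoids. */
    return (void *)(uintptr_t)(v & NANBOX_PAYLOAD_MASK);
}
```

In this sketch doubles are free to read and pointers pay for a mask, which is the opposite trade-off from the one described for JavaScriptCore above.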
> memcpy from bytes to a NaN should work fine
Signaling NaNs are explicitly undefined in C11 F.2.1.: "This specification does not define the behavior of signaling NaNs." - and in practice may be "quieted" by conversion to Quiet NaNs, changing their bit patterns. Fast math optimization flags will also break the hell out of your code by assuming NaNs are impossible. I want to say there are more circumstances where optimizers and compiler generated code can butcher your NaN payloads, but I'd be working off recollected hearsay and I can't find a source, so don't quote me on that.
NaN boxing is common enough that, if you take the right precautions, a modern compiler should probably support it, maybe. NaN boxing is uncommon enough that, if your codebase needs to be sufficiently portable, you need an opt out for when it breaks. Let's review duktape's scars:
https://github.com/svaarala/duktape/blob/123d9426d5e5b36d5da...
https://github.com/svaarala/duktape/blob/5252b7a50611a3cb8bf...
https://github.com/svaarala/duktape/blob/224a0b89ca08a36e37e...
Note that "the right precautions" involve unions and proper integer types, both to avoid optimizer-invoked rewrites of the value and to make debugging tractable when things go wrong, not simply YOLOing bytes into a double via memcpy. Note that debugging when it all goes terribly wrong can be quite painful. I've personally had the misfortune of being forced to debug duktape being built with fast math optimizations enabled on one "rare" platform + build configuration that wasn't caught by duktape's #if defined(__FAST_MATH__) checks linked above (wasn't Clang nor GCC, so go figure it didn't make the same #define).
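A minimal sketch of what such precautions might look like in C is below. It is not duktape's code: the union, the function names, and the decision to hard-fail with `#error` are assumptions for illustration. The runtime check is a heuristic for toolchains that enable fast-math-style optimizations without defining `__FAST_MATH__`.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time guard: GCC and Clang define __FAST_MATH__ under -ffast-math.
   Other compilers may not, hence the runtime check further down. */
#if defined(__FAST_MATH__)
#error "fast-math is enabled; NaN-boxed values are not safe in this build"
#endif

/* Hypothetical boxed value. All tagging logic goes through `bits`;
   `d` is only read when the value is known to be an ordinary double. */
typedef union {
    uint64_t bits;
    double   d;
} boxed_t;

static boxed_t boxed_from_bits(uint64_t bits) {
    boxed_t b;
    b.bits = bits;
    return b;
}

/* NaN test done purely on the integer view: exponent all ones and a
   nonzero mantissa. No floating-point comparison is involved. */
static int boxed_is_nan_pattern(boxed_t b) {
    return (b.bits & UINT64_C(0x7FF0000000000000)) == UINT64_C(0x7FF0000000000000)
        && (b.bits & UINT64_C(0x000FFFFFFFFFFFFF)) != 0;
}

/* Heuristic runtime self-test: if the compiler assumes NaNs cannot occur,
   `n != n` tends to be folded to false. Returns 1 when NaN comparisons
   still behave as IEEE 754 requires. */
static int nan_semantics_look_sane(void) {
    volatile double zero = 0.0; /* volatile: force a runtime division */
    double n = zero / zero;
    return n != n;
}
```

Calling `nan_semantics_look_sane()` once at startup and falling back to an unpacked representation (or aborting) when it returns 0 is one way to provide the opt-out mentioned above.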
> Well half of NaNs are quiet, so that's easy to deal with
Which half varies by architecture (I'm looking at you, MIPS - and apparently RISC-V at one point was going to go the MIPS route with an all-1s payload for canonical qNaNs?) - so platform-specific spaghet and requirements testing are in the mix.
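One way to test rather than assume is to look at the NaN the hardware itself produces. The sketch below is an illustration with a hypothetical function name: it inspects bit 51 (the top mantissa bit) of the result of 0.0/0.0, which IEEE 754 specifies to be a quiet NaN. x86 and ARM set that bit for quiet NaNs; legacy MIPS uses the opposite convention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the hardware's default quiet NaN has the top mantissa bit
   (bit 51) set, 0 if it is clear (the legacy MIPS convention). */
static int default_nan_has_top_mantissa_bit(void) {
    volatile double zero = 0.0; /* volatile: force a runtime division */
    double n = zero / zero;
    uint64_t bits;
    memcpy(&bits, &n, sizeof bits);
    /* Sanity check on the integer view: exponent field must be all ones. */
    assert((bits & UINT64_C(0x7FF0000000000000)) == UINT64_C(0x7FF0000000000000));
    return (int)((bits >> 51) & 1u);
}
```

A NaN-boxing layout that hardcodes "bit 51 set means quiet" could assert this at startup and refuse to run, or switch representation, when it returns 0.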
> And the fast optimization flags are themselves violating the standard so those don't count.
Some jerk will enable them, standards-violating or not. While it's valid for the solution to be disabling the optimization, in practice you will need to debug and write defensive code when this happens. I know of at least one platform which enables such optimizations by default for its "release" builds, and while I'm angry at them for doing so, I'm unfortunately relegated to existing in the same reality as them.
Perhaps you're lucky enough to exist in a different reality?
> I suggested memcpy because it's the safe way.
More generally, yes, but in the specific context of preserving a NaN payload I would not trust the optimizer to keep my NaN payloads untouched when stored as a floating point value. LLVM developers appear to agree that NaN payload preservation is not guaranteed - I guess you can quote me on my earlier "optimizers can butcher your NaN payloads", presumably even without fast math optimizations:
https://lists.llvm.org/pipermail/llvm-dev/2018-November/1276...
Which causes a good bit of awkwardness for Rust:
https://github.com/rust-lang/rust/issues/73328
The solution is to not attempt to store a payload-laden NaN as any kind of floating point value, even via memcpy. A union is acceptable in the sense that at least, then, you're supposedly storing an integer, and the bit pattern of that would be preserved. A memcpy to a temporary float immediately before NaN testing / floating point usage - and never back to integer in a naive attempt to extract the possibly discarded payload - would work, but is a hell of a caveat to omit when saying "memcpy from bytes to a NaN should work fine", especially when mentioning `nan` with its payload argument, which is unextractable without doing the naive "back to integer" extraction which, if the above is to be believed, is unreliable at best.
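A sketch of that discipline in C, under the assumption of a 51-bit payload below the quiet bit and with hypothetical names throughout: values live as `uint64_t`, payloads are only ever read from the integer, and a `double` exists only as a short-lived temporary for arithmetic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value type: always stored and copied as an integer. */
typedef uint64_t value_t;

#define VALUE_EXP_MASK      UINT64_C(0x7FF0000000000000)
#define VALUE_MANT_MASK     UINT64_C(0x000FFFFFFFFFFFFF)
#define VALUE_PAYLOAD_MASK  UINT64_C(0x0007FFFFFFFFFFFF)
#define VALUE_CANONICAL_NAN UINT64_C(0x7FF8000000000000)

/* NaN test on the integer bits, so it does not depend on how the
   compiler treats floating-point comparisons. */
static int value_is_nan(value_t v) {
    return (v & VALUE_EXP_MASK) == VALUE_EXP_MASK && (v & VALUE_MANT_MASK) != 0;
}

/* The payload is read from the stored integer, never from a double. */
static uint64_t value_payload(value_t v) {
    assert(value_is_nan(v));
    return v & VALUE_PAYLOAD_MASK;
}

/* Temporary double, created immediately before floating-point use. */
static double value_to_number(value_t v) {
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

/* Storing a computed result: any NaN is replaced by the canonical one,
   so nothing downstream relies on a payload that passed through an FPU. */
static value_t value_from_number(double d) {
    value_t v;
    memcpy(&v, &d, sizeof v);
    return value_is_nan(v) ? VALUE_CANONICAL_NAN : v;
}

static value_t value_add(value_t a, value_t b) {
    return value_from_number(value_to_number(a) + value_to_number(b));
}
```

The conversion back to an integer in `value_from_number` is only for storing an arithmetic result; it deliberately discards whatever payload the hardware or optimizer left in a NaN instead of trusting it.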