NaN Boxing

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • WebKit

    Home of the WebKit project, the browser engine used by Safari, Mail, App Store and many other applications on macOS, iOS and Linux.

  • V8 does not use NaN-boxes; instead it uses the low bit to distinguish between a 31-bit small integer ("smi") and a pointer. Doubles are sometimes stored inline ("double field unboxing"), though I'm not sure exactly how that works; other times they are heap-allocated. I'm not sure whether there is a specialized double allocator, and I'd like to know.

    JavaScriptCore uses a tweaked NaN-box [1]: values are stored via a NaN box minus a constant, which avoids requiring a mask when chasing pointers. This makes pointers cheaper but floating point operations more expensive.

    SpiderMonkey and Hermes both use straight NaN-boxing to my knowledge.

    [1] https://github.com/WebKit/WebKit/blob/ec6b5337e777f9b460ec6b...
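
The offset trick described for JavaScriptCore above can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not JavaScriptCore's actual layout: the offset of 2^48, the canonical NaN constant, and every hyp_ name are hypothetical, and the sketch assumes pointers fit in the low 48 bits. Doubles are shifted by a constant on the way in and out, so a pointer is stored as its raw bits and needs no mask.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants; real engines pick their own offset and layout. */
#define HYP_DOUBLE_OFFSET (UINT64_C(1) << 48)
#define HYP_CANONICAL_NAN UINT64_C(0x7FF8000000000000)

typedef uint64_t hyp_value;

/* A value whose top 16 bits are zero is a pointer and is usable as-is. */
static int hyp_is_pointer(hyp_value v) { return (v >> 48) == 0; }

/* This sketch has only two kinds of value, so everything else is a double. */
static int hyp_is_double(hyp_value v) { return v >= HYP_DOUBLE_OFFSET; }

static hyp_value hyp_from_pointer(void *p) {
    uint64_t bits = (uint64_t)(uintptr_t)p;
    assert((bits >> 48) == 0); /* assumption: 48-bit addresses */
    return bits;
}

static void *hyp_to_pointer(hyp_value v) {
    assert(hyp_is_pointer(v));
    return (void *)(uintptr_t)v; /* no mask needed */
}

static hyp_value hyp_from_double(double d) {
    uint64_t bits;
    if (isnan(d)) {
        /* Drop any payload so that adding the offset cannot wrap around. */
        bits = HYP_CANONICAL_NAN;
    } else {
        memcpy(&bits, &d, sizeof bits);
    }
    return bits + HYP_DOUBLE_OFFSET;
}

static double hyp_to_double(hyp_value v) {
    uint64_t bits;
    double d;
    assert(hyp_is_double(v));
    bits = v - HYP_DOUBLE_OFFSET; /* the extra cost on the double path */
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

With this layout, pointer access is a plain cast while every double read pays for a subtraction, which matches the trade-off the comment describes. Encodings above that of negative infinity stay free for other tags.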

  • Duktape

    Duktape - embeddable Javascript engine with a focus on portability and compact footprint

  • > memcpy from bytes to a NaN should work fine

    Signaling NaNs are explicitly left undefined by C11 F.2.1: "This specification does not define the behavior of signaling NaNs." In practice they may be "quieted" by conversion to quiet NaNs, which changes their bit patterns. Fast math optimization flags will also break the hell out of your code by assuming NaNs are impossible. I want to say there are more circumstances where optimizers and compiler-generated code can butcher your NaN payloads, but I'd be working off recollected hearsay and I can't find a source, so don't quote me on that.

    NaN boxing is common enough that, if you take the right precautions, a modern compiler should probably support it, maybe. NaN boxing is uncommon enough that, if your codebase needs to be sufficiently portable, you need an opt out for when it breaks. Let's review duktape's scars:

    https://github.com/svaarala/duktape/blob/123d9426d5e5b36d5da...

    https://github.com/svaarala/duktape/blob/5252b7a50611a3cb8bf...

    https://github.com/svaarala/duktape/blob/224a0b89ca08a36e37e...

    Note that "the right precautions" involve unions and proper integer types, to avoid both optimizer-invoked rewrites of the value and the debugging that follows when things go wrong, not simply YOLOing bytes into a double via memcpy. Note that debugging when it all goes terribly wrong can be quite painful. I've personally had the misfortune of being forced to debug duktape built with fast math optimizations enabled on one "rare" platform + build configuration that wasn't caught by duktape's #if defined(__FAST_MATH__) checks linked above (it was neither Clang nor GCC, so go figure it didn't make the same #define).
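
As a rough sketch of the precautions the comment above describes, the C fragment below pairs a union that has an integer member with a compile-time fast-math check and a runtime self-test. It is not duktape's code: HYP_USE_PACKED, hyp_tval, hyp_nan_selftest and hyp_packed_ok are hypothetical names. Only the __FAST_MATH__ macro is real, and as the comment notes, not every compiler defines it, which is why the sketch assumes a runtime check is wanted as well.

```c
#include <assert.h>
#include <stdint.h>

/* GCC and Clang define __FAST_MATH__ under -ffast-math. Other compilers
 * may not, so this check is necessary but not sufficient. */
#if defined(__FAST_MATH__)
#define HYP_USE_PACKED 0
#else
#define HYP_USE_PACKED 1
#endif

static_assert(sizeof(double) == sizeof(uint64_t),
              "packed values assume a 64-bit double");

typedef union {
    uint64_t u; /* tag tests and payload access go through this member */
    double d;   /* read only after the tag test says "plain double" */
} hyp_tval;

/* Runtime self-test: returns 1 if a NaN still compares unequal to itself.
 * A build that assumes NaNs are impossible may fold the comparison to 0.
 * The volatile qualifier discourages evaluating the load at compile time. */
static int hyp_nan_selftest(void) {
    volatile hyp_tval t;
    double x;
    t.u = UINT64_C(0x7FF8000000000000); /* a quiet NaN on most targets */
    x = t.d;
    return (x != x) ? 1 : 0;
}

/* Use the packed representation only if both checks pass. */
static int hyp_packed_ok(void) {
    return HYP_USE_PACKED && hyp_nan_selftest();
}
```

A program could call hyp_packed_ok() once at startup and fall back to an unpacked tag-plus-value struct when it returns 0, which is one way to provide the opt-out mentioned above.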

  • rust

    Empowering everyone to build reliable and efficient software.

  • > Well half of NaNs are quiet, so that's easy to deal with

    Which half varies by architecture (I'm looking at you, MIPS, and apparently RISC-V at one point was going to go the MIPS route with an all-1s payload for canonical qNaNs?), so platform-specific spaghet and requirements testing are in the mix.

    > And the fast optimization flags are themselves violating the standard so those don't count.

    Some jerk will enable them, standards-violating or not. While it's valid for the solution to be disabling the optimization, in practice you will need to debug and write defensive code when this happens. I know of at least one platform which enables such optimizations by default for its "release" builds, and while I'm angry at them for doing so, I'm unfortunately relegated to existing in the same reality as them.

    Perhaps you're lucky enough to exist in a different reality?

    > I suggested memcpy because it's the safe way.

    More generally, yes, but in the specific context of preserving a NaN payload I would not trust the optimizer to keep my NaN payloads untouched when stored as a floating point value. LLVM developers appear to agree that NaN payload preservation is not guaranteed - I guess you can quote me on my earlier "optimizers can butcher your NaN payloads", presumably even without fast math optimizations:

    https://lists.llvm.org/pipermail/llvm-dev/2018-November/1276...

    Which causes a good bit of awkwardness for Rust:

    https://github.com/rust-lang/rust/issues/73328

    The solution is to not attempt to store a payload-laden NaN as any kind of floating point value, even via memcpy. A union is acceptable in the sense that at least, then, you're supposedly storing an integer, and the bit pattern of that would be preserved. A memcpy to a temporary float immediately before NaN testing / floating point usage, and never back to integer in a naive attempt to extract the possibly discarded payload, would work, but is a hell of a caveat to omit when saying "memcpy from bytes to a NaN should work fine", especially when mentioning `nan` with its payload argument, which is unextractable without doing the naive "back to integer" extraction which, if the above is to be believed, is unreliable at best.
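
The approach in the last paragraph above can be sketched in C along these lines. This is a minimal illustration, not any engine's real layout: the tag value 0xFFFC, the masks, and the hyp_ names are assumptions. The boxed value lives in a uint64_t at all times, the tag test and payload extraction are pure integer operations, and a double is materialized through memcpy only after the tag test has ruled out a payload.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The boxed value is only ever stored as an integer, never as a double. */
typedef struct { uint64_t bits; } hyp_box;

/* Hypothetical layout: top 16 bits are the tag, low 48 bits the payload.
 * Whether bit 51 means "quiet" depends on the architecture, which does not
 * matter here because tagged values are never loaded as doubles. */
#define HYP_TAG_MASK  UINT64_C(0xFFFF000000000000)
#define HYP_TAG_BOXED UINT64_C(0xFFFC000000000000)
#define HYP_PAYLOAD   UINT64_C(0x0000FFFFFFFFFFFF)
#define HYP_EXP_MASK  UINT64_C(0x7FF0000000000000)
#define HYP_FRAC_MASK UINT64_C(0x000FFFFFFFFFFFFF)
#define HYP_CANON_NAN UINT64_C(0x7FF8000000000000)

/* NaN test on the integer bits, so it does not rely on float comparisons. */
static int hyp_bits_are_nan(uint64_t bits) {
    return (bits & HYP_EXP_MASK) == HYP_EXP_MASK && (bits & HYP_FRAC_MASK) != 0;
}

static int hyp_is_boxed(hyp_box b) {
    return (b.bits & HYP_TAG_MASK) == HYP_TAG_BOXED;
}

static hyp_box hyp_box_payload(uint64_t payload) {
    hyp_box b;
    b.bits = HYP_TAG_BOXED | (payload & HYP_PAYLOAD);
    return b;
}

/* Payload extraction never goes through a floating point value. */
static uint64_t hyp_payload(hyp_box b) {
    assert(hyp_is_boxed(b));
    return b.bits & HYP_PAYLOAD;
}

static hyp_box hyp_box_double(double d) {
    hyp_box b;
    memcpy(&b.bits, &d, sizeof b.bits);
    if (hyp_bits_are_nan(b.bits)) {
        b.bits = HYP_CANON_NAN; /* a real NaN must not collide with a tag */
    }
    return b;
}

/* The double is created only here, after the tag test, and is never
 * converted back to recover a payload. */
static double hyp_as_double(hyp_box b) {
    double d;
    assert(!hyp_is_boxed(b));
    memcpy(&d, &b.bits, sizeof d);
    return d;
}
```

The design choice is that an optimizer or FPU that rewrites NaN payloads can only ever see the canonical NaN, so there is nothing left for it to lose.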
