Base64 Encoding, Explained

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • excel_97_egg

    A web port of the magic carpet simulator hidden within Microsoft Excel 97

  • Here is my Base64 encoder shader:

    https://github.com/Rezmason/excel_97_egg/blob/main/glsl/base...

    I got it down to about thirteen lines of GLSL:

    https://github.com/Rezmason/excel_97_egg/blob/main/glsl/base...

    I use it for Cursed Mode of my side project, which renders the WebGL framebuffer to a 640x480 indexed color BMP, about 15 times per second:

    https://rezmason.github.io/excel_97_egg/?cursed=1

  • proposal-arraybuffer-base64

    TC39 proposal for Uint8Array<->base64/hex

  • There are some additional interesting details, and a surprising amount of variation in those details, once you start really digging into things.

    If the length of your input data isn't exactly a multiple of 3 bytes, then encoding it will use either 2 or 3 base64 characters to encode the final 1 or 2 bytes. Since each base64 character is 6 bits, this means you'll be using either 12 or 18 bits to represent 8 or 16 bits. Which means you have an extra 4 or 2 bits which don't encode anything.
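The leftover-bit arithmetic above can be checked with a short Python sketch. The helper name `final_group` is made up for this illustration; only `base64.b64encode` comes from the standard library.

```python
import base64

def final_group(data: bytes):
    """Return (base64 chars, unused bits) for the trailing partial group.

    Hypothetical helper for illustration, not part of any library.
    """
    leftover = len(data) % 3            # bytes that do not fill a 3-byte group
    if leftover == 0:
        return 0, 0                     # input ends on a group boundary
    chars = leftover + 1                # 1 byte -> 2 chars, 2 bytes -> 3 chars
    unused = chars * 6 - leftover * 8   # bits in those chars that carry no data
    return chars, unused

for sample in (b"\xff", b"\xff\xff", b"\xff\xff\xff"):
    print(len(sample), final_group(sample), base64.b64encode(sample))
```

For a 1-byte input this reports 2 characters with 4 unused bits, and for a 2-byte input 3 characters with 2 unused bits, matching the figures in the comment.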

    In the RFC (RFC 4648), encoders are required to set those bits to 0, but decoders only "MAY" choose to reject input which does not have those set to 0. In practice, nothing rejects those by default, and as far as I know only Ruby, Rust, and Go allow you to fail on such inputs - Python has a "validate" option, but it doesn't validate those bits.
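As a rough sketch of what rejecting non-zero trailing bits could look like in Python, one option is to decode and then re-encode, accepting the input only if the round trip reproduces it exactly. The function name `is_canonical` is an assumption for this example and is not a standard library API.

```python
import base64
import binascii

def is_canonical(encoded: bytes) -> bool:
    """True if `encoded` is exactly what a conforming encoder would emit.

    Hypothetical helper: decode, re-encode, and compare with the input.
    """
    try:
        decoded = base64.b64decode(encoded, validate=True)
    except binascii.Error:
        return False
    return base64.b64encode(decoded) == encoded

# b"/w==" is the canonical encoding of b"\xff".
# b"/x==" differs only in the 4 trailing bits that carry no data.
print(base64.b64decode(b"/x=="))                      # accepted by default
print(is_canonical(b"/w=="), is_canonical(b"/x=="))
```

The re-encode comparison costs an extra pass over the data, but it avoids any dependence on what a particular decoder's strict mode does or does not check.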

    The other major difference is in handling of whitespace and other non-base64 characters. A surprising number of implementations, including Python, allow arbitrary characters in the input, and silently ignore them. That's a problem if you get the alphabet wrong - for example, in Python `base64.standard_b64decode(base64.urlsafe_b64encode(b'\xFF\xFE\xFD\xFC'))` will silently give you the wrong output, rather than an error. Ouch!
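The alphabet mix-up described above can be reproduced with the standard library alone. This is a minimal sketch; the wrapper name `decode_standard_strict` is invented for the example, and it simply passes `validate=True` so that characters outside the standard alphabet raise an error instead of being dropped.

```python
import base64
import binascii

data = b"\xff\xfe\xfd\xfc"
url_encoded = base64.urlsafe_b64encode(data)        # uses '-' and '_' instead of '+' and '/'

# Lenient default: characters outside the standard alphabet are discarded.
lenient = base64.standard_b64decode(url_encoded)
print(lenient == data)                              # the round trip does not match

def decode_standard_strict(encoded: bytes) -> bytes:
    """Hypothetical wrapper: standard alphabet only, fail on anything else."""
    return base64.b64decode(encoded, validate=True)

try:
    decode_standard_strict(url_encoded)
except binascii.Error as exc:
    print("rejected:", exc)
```

With `validate=True` the mismatch surfaces as a `binascii.Error` rather than as silently corrupted bytes.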

    Another fun fact is that Ruby's base64 encoder will put linebreaks every 60 characters, which is a wild choice: the only standard encoding that requires lines that short is PEM, and PEM requires _exactly_ 64 characters per line.
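For comparison, here is a small Python sketch that wraps base64 output at a caller-chosen width, defaulting to the 64 characters mentioned for PEM. The function `b64_wrapped` is a made-up name for this illustration; Python's own `base64.encodebytes` wraps at 76 characters instead.

```python
import base64

def b64_wrapped(data: bytes, width: int = 64) -> str:
    """Base64-encode `data` and break the result into lines of `width` chars.

    Hypothetical helper; every line except possibly the last is exactly `width` long.
    """
    encoded = base64.b64encode(data).decode("ascii")
    return "\n".join(encoded[i:i + width] for i in range(0, len(encoded), width))

wrapped = b64_wrapped(bytes(range(100)))
print(wrapped)
print([len(line) for line in wrapped.split("\n")])
```

Doing the wrapping in a separate step like this keeps the line length an explicit decision rather than a property of whichever encoder happens to be in use.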

    I have a writeup of some of the differences among programming languages and some JavaScript libraries here [1], because I'm working on getting a better base64 added to JS [2].

    [1] https://gist.github.com/bakkot/16cae276209da91b652c2cb3f612a...

    [2] https://github.com/tc39/proposal-arraybuffer-base64

  • b64fix

    Compute the Base64 fixpoint up to a given precision

  • Cool! This caught my interest so I wrote a little program to compute that fixpoint up to a specified precision: https://github.com/cls/b64fix
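One way to approximate such a fixpoint, sketched here in Python, is to keep re-encoding a seed until the first `n` characters stop changing. This is an assumption about a workable approach and not necessarily how b64fix does it; the function name `base64_fixpoint_prefix`, the seed, and the iteration cap are all chosen for this illustration.

```python
import base64

def base64_fixpoint_prefix(n: int, seed: bytes = b"a") -> bytes:
    """Return the first n bytes of a string whose base64 encoding starts with itself.

    Hypothetical sketch: iterate the encoder and stop once the first n
    characters are unchanged by one more round of encoding.
    """
    if n <= 0:
        return b""
    # Keep only a window that is a multiple of 3 bytes, so truncating the
    # input never changes the leading characters of its encoding.
    window = 3 * ((n + 2) // 3)
    current = seed
    for _ in range(10 * n + 20):        # generous cap, an assumption
        encoded = base64.b64encode(current)[:window]
        if len(current) >= n and encoded[:n] == current[:n]:
            return encoded[:n]
        current = encoded
    raise RuntimeError("prefix did not stabilise within the iteration cap")

prefix = base64_fixpoint_prefix(40)
print(prefix.decode("ascii"))
print(base64.b64encode(prefix).startswith(prefix))
```

The stopping condition is self-checking: a returned prefix is by construction a prefix of its own encoding, which is the property that defines the fixpoint.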
