CVE-2023-4863: Heap buffer overflow in WebP (Chrome)

libwebp

13 1,908 8.7 C

Mirror only. Please do not send pull requests. See https://chromium.googlesource.com/webm/libwebp/+/HEAD/CONTRIBUTING.md.

The original commit in question: https://github.com/webmproject/libwebp/commit/f75dfbf23d1df1...
The commit that fixes this bug: https://github.com/webmproject/libwebp/commit/902bc919033134...
The original commit optimizes a Huffman decoder. The decoder uses a well-known optimization: it reads N bits in advance and determines how many bits actually have to be consumed and which symbol should be decoded or, if those N bits are a prefix of multiple symbols, which table should be consulted for the remaining bits.
The old version did use lookup tables for short symbols, but longer symbols needed a graph traversal. The new version improved this by using an array of lookup tables. Each entry contains (nbits, value), where `nbits` is the number of bits to be consumed and `value` is normally a symbol; but if `nbits` exceeds N, `value` is interpreted as a table index and `nbits` is reinterpreted as the longest code length in that subtree. So each subsequent table should have `2^(nbits - N)` entries (the root table is always fixed at 2^N entries).
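To make the (nbits, value) layout concrete, here is a minimal sketch of a two-level table decoder. It is not libwebp's code: the names `ROOT_BITS`, `canonical_codes`, `build_tables` and `decode` are made up for illustration, codes are assigned canonically and read MSB-first for readability (real decoders typically use a bit-reversed layout), and second-level entries store the number of bits remaining after the root lookup.

```python
ROOT_BITS = 3  # "N" in the text; an arbitrary small value for illustration


def canonical_codes(lengths):
    """Assign canonical MSB-first codes; lengths[sym] == 0 means unused."""
    codes, code, prev_len = {}, 0, 0
    for length, sym in sorted((n, s) for s, n in enumerate(lengths) if n > 0):
        code <<= length - prev_len
        codes[sym] = (code, length)
        code += 1
        prev_len = length
    return codes


def build_tables(lengths, root_bits=ROOT_BITS):
    """Build a root table plus one secondary table per long-code prefix."""
    root = [None] * (1 << root_bits)
    long_groups = {}
    for sym, (code, length) in canonical_codes(lengths).items():
        if length <= root_bits:
            # Short code: replicate over every root slot that starts with it.
            pad = root_bits - length
            for i in range(1 << pad):
                root[(code << pad) + i] = (length, sym)
        else:
            prefix = code >> (length - root_bits)
            long_groups.setdefault(prefix, []).append((sym, code, length))

    subtables = []
    for prefix, group in long_groups.items():
        max_len = max(length for _, _, length in group)
        sub_bits = max_len - root_bits
        sub = [None] * (1 << sub_bits)  # 2^(nbits - N) entries
        for sym, code, length in group:
            rest_len = length - root_bits
            rest = code & ((1 << rest_len) - 1)
            pad = sub_bits - rest_len
            for i in range(1 << pad):
                sub[(rest << pad) + i] = (rest_len, sym)
        # nbits > N marks "value is a table index", as described above.
        root[prefix] = (max_len, len(subtables))
        subtables.append(sub)
    return root, subtables


def peek(bits, pos, n):
    """Read n bits starting at pos from a '0'/'1' string, zero-padded."""
    return int(bits[pos:pos + n].ljust(n, "0"), 2)


def decode(bits, root, subtables, root_bits=ROOT_BITS):
    out, pos = [], 0
    while pos < len(bits):
        nbits, value = root[peek(bits, pos, root_bits)]
        if nbits <= root_bits:
            pos += nbits
            out.append(value)
        else:
            sub_bits = nbits - root_bits
            rest_len, sym = subtables[value][peek(bits, pos + root_bits, sub_bits)]
            pos += root_bits + rest_len
            out.append(sym)
    return out
```

With code lengths `[1, 2, 3, 4, 4]` and N = 3, only the prefix `111` needs a secondary table, and that table has 2^(4 - 3) = 2 entries. The sketch assumes a complete code, so every slot ends up filled; it does no validation of its own.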
The new version calculated the maximum number of entries based on the number of symbols (kTableSize). Of course, the Huffman tree comes from an untrusted source, and you can easily imagine a case where `nbits` is very big. VP8 Lossless specifically allows up to 15 bits, so the largest possible table has 2^N + 2^15 entries when every single root LUT entry is mapped to its own secondary table, and doing this doesn't need that many symbols (you only need 16-N symbols for each table). So if the Huffman tree is crafted in a way that maximizes the number of entries, it will overflow the allocation.
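The arithmetic behind that bound can be checked with a short sketch. The function names are hypothetical, and the root width of 8 bits used in the usage note is an assumed value for illustration, not something stated in the comment.

```python
def worst_case_entries(root_bits, max_code_len=15):
    """Upper bound from the text: every root slot owns a full secondary table."""
    root = 1 << root_bits
    per_subtable = 1 << (max_code_len - root_bits)
    return root + root * per_subtable  # simplifies to 2^N + 2^max_code_len


def symbols_needed(root_bits, max_code_len=15):
    """A chain-shaped subtree of depth d has d + 1 leaves, i.e. 16 - N each."""
    depth = max_code_len - root_bits
    return (1 << root_bits) * (depth + 1)
```

For an assumed N = 8, `worst_case_entries(8)` is 256 + 32768 = 33024 entries, reached with `symbols_needed(8)` = 256 * 8 = 2048 symbols. This is only the bound described in the comment; the true maximum for a given alphabet size can be lower.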
To be fair, I can see why this happened: the Huffman decoding step is one of the most computationally intensive parts of many compression formats, and any small improvement matters. The Huffman decoder optimization described above is well known, but the longer-code case is commonly considered less important to optimize because longer codes should rarely appear in general. The original commit message argued otherwise, and the change was merged.

jxl.js

25 296 0.0 JavaScript

JPEG XL decoder in JavaScript using WebAssembly (WASM)
libavif

44 1,370 9.7 C

libavif - Library for encoding and decoding .avif files

It's 2023, surely this is not yet another bug related to memory unsafety that could be avoided if we'd stop writing critical code that deals with extremely complex untrusted input (media codecs) in memory unsafe languages?
Yep, of course it is: https://github.com/webmproject/libwebp/commit/902bc919033134...
I guess libwebp could be excused as it was started when there were no alternatives, but even for new projects today we're still making the same mistake[1][2][3].
[1] -- https://code.videolan.org/videolan/dav1d
[2] -- https://github.com/AOMediaCodec/libavif
[3] -- https://github.com/AOMediaCodec/libiamf
Yep. Keep writing these in C; surely nothing will go wrong.

libiamf

1 28 8.9 C

Reference Software for IAMF

Electron

236 111,957 9.8 C++

:electron: Build cross-platform desktop apps with JavaScript, HTML, and CSS

It does, see [0]. Fun fact: Signal Desktop, which uses Electron under the hood, runs without a sandbox on Linux [1][2].
[0] https://github.com/electron/electron/pull/39824
[1] https://github.com/signalapp/Signal-Desktop/issues/5195
[2] https://github.com/signalapp/Signal-Desktop/pull/4381

Signal-Desktop

322 13,999 9.9 TypeScript

A private messenger for Windows, macOS, and Linux.

l4v

15 488 9.6 Isabelle

seL4 specification and proofs

You can't really retrofit safety onto C. The best that can be achieved is seL4, which, while written in C, has a separate proof of its correctness: https://github.com/seL4/l4v
The proof is much, much more work than the microkernel itself. A proof for something as large as WebP might take decades.

wuffs

80 3,743 9.4 C

Wrangling Untrusted File Formats Safely

I agree that Wuffs [1] would have been a very good alternative, if it could be made more general. AFAIK Wuffs is still very limited; in particular, it never allows dynamic allocation. Many formats, including those supported by Wuffs the library, need dynamic allocation, so Wuffs code has to be glued together with unverified non-Wuffs code [2]. This only works with simpler formats.
[1] https://github.com/google/wuffs/blob/main/doc/wuffs-the-lang...
[2] https://github.com/google/wuffs/blob/main/doc/note/memory-sa...

kani

47 1,900 9.5 Rust

Kani Rust Verifier

> those applications need the proof for correctness so that more dangerous code---say, what would need `unsafe` in Rust---can be safely added
There are actually already tools built for this very purpose in Rust (see Kani [1] for instance).
Formal verification has a serious scaling problem, so forming programs in such a way that there are a few performance-critical areas that use unsafe routines seems like the best route. I feel like Rust leans into this paradigm with `unsafe` blocks.
[1] - https://github.com/model-checking/kani

BrowserBoxPro

24 2,602 6.9 JavaScript

Discontinued :cyclone: BrowserBox is Web application virtualization via zero trust remote browser isolation and secure document gateway technology. Embed secure unrestricted webviews on any device in a regular webpage. Multiplayer embeddable browsers, open source! [Moved to: https://github.com/BrowserBox/BrowserBox]

Agree. This is one of the reasons it's better to go with the older and more reliable JPEG for viewport streaming. An exploit chain would need to penetrate screen capture images to pass to the client. Browser zero days do occur, and this is why it's important to have protection. For added protection, consider browser isolation. Check out open source Zero Trust browser isolation at BrowserBox, using JPEG (now WebP): https://github.com/dosyago/BrowserBoxPro
Technically, we did try using WebP due to its significant bandwidth gains. However, the compute overhead for encoding versus JPEG introduced unacceptable latency into our streaming pipeline, so for now, we're still against it. Security is an additional mark against the newer standard, as good as it is!

image

37 4,505 9.1 Rust

Encoding and decoding images in Rust (by image-rs)

FTR there is a WebP decoder implementation in safe Rust in the image crate: https://github.com/image-rs/image
It was quite incomplete for a long time, but work over the last year has implemented many WebP features. Chromium now has a policy of allowing the use of Rust dependencies, so maybe Chromium could start adopting it?

ZLib

49 5,278 8.9 C

A massively spiffy yet delicately unobtrusive compression library.

So the real issue here is the lack of tree validation before tree construction, I believe. I'm surprised that this check was not yet implemented (I actually checked libwebp to make sure that I wasn't missing one). Given this blind spot, an automated test based on domain knowledge would likely have been useless for catching this bug.
[1] https://github.com/madler/zlib/blob/master/examples/enough.c
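As a rough illustration of what validating the tree before constructing tables could look like, the sketch below checks the Kraft sum of the code lengths and refuses to build anything unless the code is complete. The names `kraft_status` and `checked_build` are hypothetical, this is not the fix that libwebp applied, and special cases such as single-symbol trees are deliberately left out.

```python
def kraft_status(lengths, max_len=15):
    """Classify code lengths as 'complete', 'incomplete' or 'oversubscribed'.

    lengths[sym] == 0 means the symbol is unused. Integer arithmetic scaled
    by 2^max_len avoids floating-point rounding in the Kraft sum.
    """
    used = [n for n in lengths if n > 0]
    if any(n > max_len for n in used):
        raise ValueError("code length exceeds max_len")
    total = sum(1 << (max_len - n) for n in used)
    full = 1 << max_len
    if total == full:
        return "complete"
    return "oversubscribed" if total > full else "incomplete"


def checked_build(lengths, build):
    """Run the table builder only after the lengths pass validation."""
    status = kraft_status(lengths)
    if status != "complete":
        raise ValueError("rejecting " + status + " Huffman code")
    return build(lengths)
```

Here `build` stands for whatever table constructor the decoder uses. The point of the ordering is that a size bound computed for valid trees only holds if invalid trees never reach the constructor.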

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Rust Miscellaneous Compression Image processing isabelle
Post date: 12 Sep 2023
