Wrangling Untrusted File Formats Safely
Program and data aren't really different, philosophically. On some level this even applies to people. When someone teaches you French is that program or data? Is it just data? Why can you now understand French then? Or if it's program, how does that work, who taught the teacher how to program you?
So, our best effort is to constrain what certain data can do when we process it, in the hope that this prevents surprising negative consequences like a PDF that steals privileged information and sends it elsewhere.
Notice that, in some sense, a PDF which just contains a photograph of your wife tied to a chair and holding today's newspaper, plus human readable text like, "We have your wife Sarah and all three kids Beth, Jim and Amanda. We are watching. Do not try to call for help. Email the privileged information to [email protected] or we will kill your family" is also potentially effective at doing this, but we would not usually consider that an exploit in this context.
One irritation in this space is that programmers love General Purpose Programming Languages. The idea of the general purpose language is that it can do anything. But the problem in this sort of situation is that we don't want programs which can do anything, in fact doing anything is our worst case scenario. We actually want Special Purpose Programming Languages. We want to write our PDF data processing software in a language that even if we were trying can't do the things that should never happen as a result of processing a PDF.
This is the purpose of languages like WUFFS: https://github.com/google/wuffs
You can't write a WUFFS program to, for example, email anything to [email protected] even if you desperately needed to, which means you definitely won't accidentally write a program which can email the privileged information to the crooks when fed a PDF. Of course the PDF mentioned earlier with the kidnap note inside it could still work. And also of course making a PDF renderer out of WUFFS would be a really big ask. WUFFS-the-library today can render PNG, GIF, BMP but notably not yet JPEG. But it's clearly possible for something like PDF rendering to happen under these constraints. Nobody ordinarily viewing a PDF wants it to do arbitrary stuff.
Truly a developer’s best friend. Scout APM is great for developers who want to find and fix performance issues in their applications. With Scout, we'll take care of the bugs so you can focus on building great things 🚀.
I just figured out how do have (generics const limited) dependable types in Rust
1 project | reddit.com/r/rust | 1 Oct 2022
wuffs/doc at main · google/wuffs
1 project | reddit.com/r/YourselfYou | 20 Aug 2022
Ivy: Rob Pike's APL-Like Language / Desk Calculator
3 projects | news.ycombinator.com | 7 Aug 2022
Memory Safety for the World’s Largest Software Project
1 project | reddit.com/r/rust | 24 Jun 2022
About compile time overflow prevention...
1 project | reddit.com/r/ProgrammingLanguages | 19 Jun 2022