A 17-line C program freezes the Mac kernel (2018)

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • m1-panic

    Minimum requirements for triggering an M1 CPU panic

  • Coincidentally, I also stumbled upon a way to make the kernel of Apple Silicon Macs panic and restart while developing the https://lowtechguys.com website.

    I distilled the problem in a repo so it can be reproduced with a single command: https://github.com/alin23/m1-panic

    I found it while on Monterey and reported it 2 times through Feedback Assistant, but it still happens on Ventura.

    NOTE: Don't try it without saving all your work, it has a very high chance of restarting your computer forcefully.

  • htop

    Discontinued htop - an interactive process viewer (by hishamhm)

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • xnu

  • The bug is in one of these two lines:

    https://github.com/apple-oss-distributions/xnu/blob/5c2921b0...

    https://github.com/apple-oss-distributions/xnu/blob/5c2921b0...

    This uses the macro SLIST_REMOVE (https://github.com/apple-oss-distributions/xnu/blob/5c2921b0...) to remove an item from the linked list. If you look at the code to do this it's a pretty simple linked list traversal: https://github.com/apple-oss-distributions/xnu/blob/5c2921b0.... However, it doesn't have a check for the end of the list, so the item must be in the list, or it's going to walk right off the end and dereference a null pointer. In this case that's exactly what it ended up doing.

  • unxip

    A fast Xcode unarchiver

  • In theory I have all of those, but currently I have none, so it's manual work. Your best friend in diagnosing a kernel crash is a KDK. If you have one that matches your build, it will have symbols in it. With a little bit of math you can take the backtrace in the crash log and slide it appropriately to match the binary. Personally I use LLDB for this. Here's an example of what this looks like on an x86-64 kernel (Apple silicon has its own math but it's largely the same): https://github.com/saagarjha/unxip/issues/14#issuecomment-10.... The kernel is typically compiled with optimization, so there's a lot of inlining and code folding, but with function names, source files, and instruction offsets it's pretty trivial to match it to the code Apple publishes.

    In this case I do not have a KDK for that build. In fact Apple has been unable to produce one for a couple of months, a inadequacy which I have repeatedly emphasized to them because of how critical they are for stuff like this. Supposedly they are working on it. Whatever; in lieu of that I got to figure out how good the tooling for analyzing kernels is these days, which was my real goal anyways.

    For this crash log I downloaded the IPSW file for your build, 22A400. All of them get linked on The iPhone Wiki, e.g. https://www.theiphonewiki.com/wiki/Firmware/Mac/13.x. Once you unpack the IPSW (it's a zip file) there are compressed kernelcache files inside. Apple changed the format of these this year so most of the tooling breaks on it, but https://github.com/blacktop/ipsw was able to decompress them. Then I loaded it in to Binary Ninja, which apparently doesn't support them either but compiling this person's plugin (+166 submodules, and a LLVM & Boost build) gets it to work: https://github.com/skr0x1c0/binja_kc.

    From there you can load up the faulting address from the crash log and see what the function looks like. In this case, a bunch of junk has been inlined into it but there's a really obvious and fairly unique string reference for "invalid knote %p detach, filter: %d". From there, you can compare it against the actual source code to see which one matches the "shape" of the function you're looking at. I happened to also pull up an older kernel which did have a KDK available and then compared its assembly to the new one to match it up to ptsd_kqops_detach. The disassembly of the crashing code is obviously doing a linked list walk so you can figure out exactly which line it is from that.

    If I wasn't lazy I might also fire up a debugger to see why the function had walked off the end of the list but without KDKs things get pretty bad, not that they're very good to begin with. I don't have a m1n1 setup (I should probably do this at some point) and the things I do have, like remote debugging or the VM GDB stub, are not really worth suffering through for a Hacker News comment.

  • ipsw

    iOS/macOS Research Swiss Army Knife

  • In theory I have all of those, but currently I have none, so it's manual work. Your best friend in diagnosing a kernel crash is a KDK. If you have one that matches your build, it will have symbols in it. With a little bit of math you can take the backtrace in the crash log and slide it appropriately to match the binary. Personally I use LLDB for this. Here's an example of what this looks like on an x86-64 kernel (Apple silicon has its own math but it's largely the same): https://github.com/saagarjha/unxip/issues/14#issuecomment-10.... The kernel is typically compiled with optimization, so there's a lot of inlining and code folding, but with function names, source files, and instruction offsets it's pretty trivial to match it to the code Apple publishes.

    In this case I do not have a KDK for that build. In fact Apple has been unable to produce one for a couple of months, a inadequacy which I have repeatedly emphasized to them because of how critical they are for stuff like this. Supposedly they are working on it. Whatever; in lieu of that I got to figure out how good the tooling for analyzing kernels is these days, which was my real goal anyways.

    For this crash log I downloaded the IPSW file for your build, 22A400. All of them get linked on The iPhone Wiki, e.g. https://www.theiphonewiki.com/wiki/Firmware/Mac/13.x. Once you unpack the IPSW (it's a zip file) there are compressed kernelcache files inside. Apple changed the format of these this year so most of the tooling breaks on it, but https://github.com/blacktop/ipsw was able to decompress them. Then I loaded it in to Binary Ninja, which apparently doesn't support them either but compiling this person's plugin (+166 submodules, and a LLVM & Boost build) gets it to work: https://github.com/skr0x1c0/binja_kc.

    From there you can load up the faulting address from the crash log and see what the function looks like. In this case, a bunch of junk has been inlined into it but there's a really obvious and fairly unique string reference for "invalid knote %p detach, filter: %d". From there, you can compare it against the actual source code to see which one matches the "shape" of the function you're looking at. I happened to also pull up an older kernel which did have a KDK available and then compared its assembly to the new one to match it up to ptsd_kqops_detach. The disassembly of the crashing code is obviously doing a linked list walk so you can figure out exactly which line it is from that.

    If I wasn't lazy I might also fire up a debugger to see why the function had walked off the end of the list but without KDKs things get pretty bad, not that they're very good to begin with. I don't have a m1n1 setup (I should probably do this at some point) and the things I do have, like remote debugging or the VM GDB stub, are not really worth suffering through for a Hacker News comment.

  • binja_kc

    Plugin for loading MachO kernelcache and dSYM files to Binary Ninja

  • In theory I have all of those, but currently I have none, so it's manual work. Your best friend in diagnosing a kernel crash is a KDK. If you have one that matches your build, it will have symbols in it. With a little bit of math you can take the backtrace in the crash log and slide it appropriately to match the binary. Personally I use LLDB for this. Here's an example of what this looks like on an x86-64 kernel (Apple silicon has its own math but it's largely the same): https://github.com/saagarjha/unxip/issues/14#issuecomment-10.... The kernel is typically compiled with optimization, so there's a lot of inlining and code folding, but with function names, source files, and instruction offsets it's pretty trivial to match it to the code Apple publishes.

    In this case I do not have a KDK for that build. In fact Apple has been unable to produce one for a couple of months, a inadequacy which I have repeatedly emphasized to them because of how critical they are for stuff like this. Supposedly they are working on it. Whatever; in lieu of that I got to figure out how good the tooling for analyzing kernels is these days, which was my real goal anyways.

    For this crash log I downloaded the IPSW file for your build, 22A400. All of them get linked on The iPhone Wiki, e.g. https://www.theiphonewiki.com/wiki/Firmware/Mac/13.x. Once you unpack the IPSW (it's a zip file) there are compressed kernelcache files inside. Apple changed the format of these this year so most of the tooling breaks on it, but https://github.com/blacktop/ipsw was able to decompress them. Then I loaded it in to Binary Ninja, which apparently doesn't support them either but compiling this person's plugin (+166 submodules, and a LLVM & Boost build) gets it to work: https://github.com/skr0x1c0/binja_kc.

    From there you can load up the faulting address from the crash log and see what the function looks like. In this case, a bunch of junk has been inlined into it but there's a really obvious and fairly unique string reference for "invalid knote %p detach, filter: %d". From there, you can compare it against the actual source code to see which one matches the "shape" of the function you're looking at. I happened to also pull up an older kernel which did have a KDK available and then compared its assembly to the new one to match it up to ptsd_kqops_detach. The disassembly of the crashing code is obviously doing a linked list walk so you can figure out exactly which line it is from that.

    If I wasn't lazy I might also fire up a debugger to see why the function had walked off the end of the list but without KDKs things get pretty bad, not that they're very good to begin with. I don't have a m1n1 setup (I should probably do this at some point) and the things I do have, like remote debugging or the VM GDB stub, are not really worth suffering through for a Hacker News comment.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts