Any standard algorithms for parsing (disassembling) machine code?

This page summarizes the projects mentioned and recommended in the original post on /r/compsci

  • PyVM

    A virtual machine written in Python that executes x86 binaries according to the Intel Software Developer Manual (by ForceBru)

  • Back in the day, I wrote this x86 emulator just for fun: https://github.com/ForceBru/PyVM, and to this day, my implementation of instruction parsing (disassembly) is bugging me because it's a mess and doesn't seem correct at all, even though it kind of works. However, after a couple of years of occasionally trying to find some kind of "proper" algorithm for machine code disassembly, I couldn't find anything... noteworthy, or well-known, or widely used.

  • bap

    Binary Analysis Platform

  • BAP (https://github.com/binaryanalysisplatform/bap), angr (https://angr.io/), and others already do what you're asking for, as more purpose-built solutions for dynamic analysis. angr in particular is written in Python.

  • angr

    A powerful and user-friendly binary analysis platform!

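One common way to structure the instruction parsing asked about in the original post is a table-driven linear sweep: read an opcode byte, look it up in a table that says which operands follow, consume those operands, and repeat. The sketch below is only an illustration under strong assumptions, not PyVM's code and not how BAP or angr work internally. It covers a handful of one-byte 32-bit x86 opcodes and ignores prefixes, ModRM/SIB bytes, and multi-byte opcodes, which is where most of the real complexity lives. The names `Insn`, `OPCODES`, `decode`, and the operand-kind strings are hypothetical.

```python
import struct
from typing import Iterator, NamedTuple

REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


class Insn(NamedTuple):
    offset: int  # address of the first byte of the instruction
    size: int    # encoded length in bytes
    text: str    # rendered assembly


# opcode byte -> (mnemonic, operand kind). The kind strings are made up
# for this sketch:
#   ""       no operands
#   "r"      register number in the low 3 bits of the opcode
#   "r,i32"  register in the opcode, then a 32-bit little-endian immediate
#   "rel8" / "rel32"  signed displacement relative to the next instruction
OPCODES = {
    0x90: ("nop", ""),
    0xC3: ("ret", ""),
    0xCC: ("int3", ""),
    0xE8: ("call", "rel32"),
    0xE9: ("jmp", "rel32"),
    0xEB: ("jmp", "rel8"),
}
for r in range(8):
    OPCODES[0x50 + r] = ("push", "r")
    OPCODES[0x58 + r] = ("pop", "r")
    OPCODES[0xB8 + r] = ("mov", "r,i32")


def decode(code: bytes, base: int = 0) -> Iterator[Insn]:
    """Linear sweep over `code`, yielding one Insn per decoded instruction.

    Unknown opcodes are emitted as single data bytes. A truncated operand
    raises struct.error; a real decoder would handle that explicitly.
    """
    pos = 0
    while pos < len(code):
        start = pos
        opcode = code[pos]
        pos += 1
        if opcode not in OPCODES:
            yield Insn(base + start, 1, f"db 0x{opcode:02x}")
            continue
        mnemonic, kind = OPCODES[opcode]
        reg = REGS32[opcode & 7]
        if kind == "":
            text = mnemonic
        elif kind == "r":
            text = f"{mnemonic} {reg}"
        elif kind == "r,i32":
            (imm,) = struct.unpack_from("<I", code, pos)
            pos += 4
            text = f"{mnemonic} {reg}, 0x{imm:x}"
        else:  # "rel8" or "rel32"
            fmt, width = ("<b", 1) if kind == "rel8" else ("<i", 4)
            (rel,) = struct.unpack_from(fmt, code, pos)
            pos += width
            # The target is relative to the end of this instruction.
            target = (base + pos + rel) & 0xFFFFFFFF
            text = f"{mnemonic} 0x{target:x}"
        yield Insn(base + start, pos - start, text)


if __name__ == "__main__":
    # push ebp; mov eax, 0x2a; pop ebp; ret
    for insn in decode(bytes.fromhex("55b82a0000005dc3")):
        print(f"{insn.offset:04x}  {insn.text}")
```

Keeping the encoding knowledge in a table rather than in branching code is the main design choice here: extending coverage means adding table rows and operand kinds, and the decode loop stays small. Full x86 needs more layers on top of this (prefix scanning, a ModRM/SIB decoder, and separate tables for escape opcodes such as `0x0F`).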
