Reverse Engineering Protobuf Definitions from Compiled Binaries

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • grpc-go

    The Go language implementation of gRPC. HTTP/2 based RPC

  • The reflection service is open-sourced (at least for some SDKs):

    * https://github.com/grpc/grpc-go/blob/master/Documentation/se...

    * https://chromium.googlesource.com/external/github.com/grpc/g...

  • Protobuf

    Protocol Buffers - Google's data interchange format

  • For at least 4 years, protobuf has had decent support for self-describing messages (very similar to Avro) as well as reflection:

    https://github.com/protocolbuffers/protobuf/blob/main/src/go...

    Xgooglers trying to make do on the cheap will just create a union of all their messages and include the message def in a self-describing message pattern. Super-sensitive network I/O can elide the message def (empty buffer), and for any RecordIO clone, well, file compression takes care of the definition.

    Definitely useful to be able to dig out old defs but protobuf maintainers have surprisingly added useful features so you don’t have to.

    Bonus points tho for extracting the protobuf defs that e.g. Apple bakes into their binaries.

  • ProtobufDecoder

    A Google Protocol Buffers (Protobuf) payload decoder/analyzer

  • Decoding protobuf (and message formats in general) can be such a pain and fun at the same time.

    I’ve written ProtobufDecoder which takes a different approach: analyze the structure of the actual messages to help you figure out the protobuf structure of a message.

    https://github.com/sandermvanvliet/ProtobufDecoder
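
    To make the idea of analyzing a message without its .proto file concrete, here is a minimal sketch of a schema-less wire-format walker. It is not code from ProtobufDecoder or protobuf-inspector; the function names `read_varint` and `decode_fields` are hypothetical, and it only handles wire types 0, 1, 2 and 5 (the deprecated group types are rejected).

```python
import struct


def read_varint(buf, pos):
    """Decode a base-128 varint at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7
        if shift >= 70:  # a 64-bit varint never needs more than 10 bytes
            raise ValueError("varint too long")


def decode_fields(buf):
    """Return [(field_number, wire_type, raw_value), ...] with no schema."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if field_number == 0:
            raise ValueError("field number 0 is not valid")
        if wire_type == 0:  # varint: int32/64, uint, sint, bool, enum
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # 64-bit: fixed64, sfixed64, double
            if pos + 8 > len(buf):
                raise ValueError("truncated 64-bit field")
            value = struct.unpack_from("<Q", buf, pos)[0]
            pos += 8
        elif wire_type == 2:  # length-delimited: string, bytes, submessage, packed
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value = bytes(buf[pos:pos + length])
            pos += length
        elif wire_type == 5:  # 32-bit: fixed32, sfixed32, float
            if pos + 4 > len(buf):
                raise ValueError("truncated 32-bit field")
            value = struct.unpack_from("<I", buf, pos)[0]
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


if __name__ == "__main__":
    sample = bytes.fromhex("089601120774657374696e67")
    for field in decode_fields(sample):
        print(field)
```

    The wire format only records field numbers and wire types, so a length-delimited value could be a string, raw bytes, a nested message or a packed repeated field. A tool has to guess, for example by trying to call `decode_fields` recursively on the payload and falling back to text or bytes when that fails.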

  • protobuf-inspector

    🕵️ Tool to reverse-engineer Protocol Buffers with unknown definition

  • I have used this other tool to good effect to reverse engineer a file format based on Protobuf: https://github.com/mildsunrise/protobuf-inspector

  • gRPC

    The C based gRPC (C++, Python, Ruby, Objective-C, PHP, C#)

  • Yes, the grpc_cli tool uses essentially the same mechanism, except implemented as a grpc service rather than as a stubby service. The basic principle of both is implementing the C++ proto library's DescriptorDatabase interface with cached recursive queries of (usually) the server's compiled-in FileDescriptorProtos.

    See also https://github.com/grpc/grpc/blob/master/doc/server-reflecti...

    The primary difference between what grpc does and what stubby does is that grpc uses a stream to ensure that the reflection requests all go to the same server, which avoids incompatible version skew and duplicate proto transmissions. With that said, in practice version skew is rarely a problem for grpc_cli-style "issue a single RPC" use cases. Even if requests do go to two or more different versions of a binary that might have incompatible proto graphs, it is very common for the request, the response and the RPC to all be in the same proto file, so you only need to make one RPC in the first place unless you're using an extension mechanism like proto2 extensions or google.protobuf.Any.
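
    The compiled-in FileDescriptorProtos mentioned above are also what one would look for when digging definitions out of a binary on disk. The following is a rough heuristic sketch, not the method of any tool listed here: it relies on the fact that `name` is field 1 of FileDescriptorProto (tag byte 0x0A) and assumes the serialized descriptor starts with that field, that the file name ends in ".proto", and that it is shorter than 128 bytes so its length fits in one byte. The function name `find_descriptor_candidates` and the allowed character set are hypothetical choices.

```python
# Bytes we accept inside a .proto file path (an assumption, not a rule).
ALLOWED = frozenset(
    b"abcdefghijklmnopqrstuvwxyz"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"0123456789_./-"
)
SUFFIX = b".proto"


def find_descriptor_candidates(blob, max_name_len=127):
    """Return [(offset, file_name), ...] for byte runs that look like the
    start of a serialized FileDescriptorProto: 0x0A, length, name."""
    hits = []
    end = blob.find(SUFFIX)
    while end != -1:
        name_end = end + len(SUFFIX)
        # Try every plausible name length and check that the two bytes
        # just before the name are the field-1 tag and that same length.
        for n in range(len(SUFFIX) + 1, max_name_len + 1):
            start = name_end - n
            if start < 2:
                break
            if blob[start - 2] == 0x0A and blob[start - 1] == n:
                name = blob[start:name_end]
                if all(c in ALLOWED for c in name):
                    hits.append((start - 2, name.decode("ascii")))
                    break
        end = blob.find(SUFFIX, end + 1)
    return hits


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        for offset, name in find_descriptor_candidates(fh.read()):
            print(f"{offset:#x}\t{name}")
```

    Each hit is only a candidate offset. Confirming it would mean parsing the bytes that follow as a FileDescriptorProto and checking that the result is well-formed, and descriptors that are compressed, stripped or stored in another form will not be found this way.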
