Seems like protobuf.js has the exact same methods as Google's implementations, with the same names (encode/decode for no length prefix, encodeDelimited/decodeDelimited for a prepended length). It is hard for me to say they're adding to the standard when they're just replicating Google's libraries.
https://github.com/protobufjs/protobuf.js#toolset
I actually went through all projects listed in [1] because I remember this very quirk. It turns out that there are many such libraries that have two variants of encode/decode functions, where the second variant prepends a varint length. In my brief inspection there do exist a few libraries with only the second variant (e.g. Rust quick-protobuf), which is legitimately problematic [2].
But if the project in question was indeed protobuf.js (see loeg's comments), it clearly distinguishes encode/decode vs. encodeDelimited/decodeDelimited. So I believe the project should not be blamed, and the better question would be why so many people chose to add this exact helper. Well, because Google itself also had the same helper [3]! So at this point protobuf should just standardize this simple framing format (with an explicitly different name though), instead of claiming that protobuf has no obligation to define one.
[1] https://github.com/protocolbuffers/protobuf/blob/main/docs/t...
[2] https://github.com/tafia/quick-protobuf/issues/130
[3] https://protobuf.dev/reference/java/api-docs/com/google/prot...
[4] https://github.com/protocolbuffers/protobuf/blob/main/src/go...
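For anyone unfamiliar with what the "delimited" variants actually do: it's just the payload length, encoded as a protobuf base-128 varint, prepended to the serialized message. Here's a minimal sketch in Python of that framing; the payload is treated as an opaque byte string (real code would serialize a protobuf message first), and the function names are made up for illustration.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (LEB128-style)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint at buf[pos]; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def encode_delimited(payload: bytes) -> bytes:
    """The 'delimited' variant: prepend the payload length as a varint."""
    return encode_varint(len(payload)) + payload

def decode_delimited(buf: bytes, pos: int = 0) -> tuple[bytes, int]:
    """Read one length-prefixed record; return (payload, next_pos)."""
    length, pos = decode_varint(buf, pos)
    return buf[pos:pos + length], pos + length
```

That's the whole format, which is exactly why so many libraries independently grew the same helper, and why standardizing it would cost so little.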
It looks like Protobuf actually has a test suite to ensure compliance with the protocol:
https://github.com/bufbuild/protobuf-conformance
Seems like a good idea for protocols in general to have an official test suite, as a way to address this problem.
> didn’t find any standard for separating protobuf messages
The fact that protobufs are not self-delimiting is an endless source of frustration, but I know of 2 standards:
- SerializeDelimited* is part of the protobuf library: https://github.com/protocolbuffers/protobuf/blob/main/src/go...
- Riegeli is "a file format for storing a sequence of string records, typically serialized protocol buffers. It supports dense compression, fast decoding, seeking, detection and optional skipping of data corruption, filtering of proto message fields for even faster decoding, and parallel encoding": https://github.com/google/riegeli
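Since the pain point is separating multiple messages in one stream, here is a rough sketch of what the SerializeDelimited*-style framing looks like when applied to a stream: each record is its length as a varint followed by its bytes, and a reader can pull records back out one at a time. Payloads are opaque bytes here (in practice each would be one serialized protobuf message), and the helper names are my own, not the library's.

```python
import io
from typing import Iterator

def write_record(stream, payload: bytes) -> None:
    """Write payload preceded by its length as a base-128 varint."""
    n = len(payload)
    while True:
        byte = n & 0x7F
        n >>= 7
        stream.write(bytes([(byte | 0x80) if n else byte]))
        if not n:
            break
    stream.write(payload)

def read_records(stream) -> Iterator[bytes]:
    """Yield each length-prefixed record until a clean EOF."""
    while True:
        shift = length = 0
        while True:
            b = stream.read(1)
            if not b:
                return  # EOF at a record boundary
            length |= (b[0] & 0x7F) << shift
            if not b[0] & 0x80:
                break
            shift += 7
        yield stream.read(length)
```

Note that this framing only works for sequential reads; it gives you none of the seeking, corruption detection, or compression that Riegeli layers on top, which is presumably why Google built the latter.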
If it was for fun and to learn how, that's fair. But are you aware of https://ntfy.sh?