restruct
HTTP Parser
DISCONTINUED
| | restruct | HTTP Parser |
|---|---|---|
| | 3 | 8 |
| Stars | 345 | 6,115 |
| Growth | 1.7% | - |
| Activity | 3.2 | 0.0 |
| | almost 2 years ago | almost 2 years ago |
| Language | Go | C |
| License | ISC License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
restruct
-
Why isn't there a Swagger/OpenAPI for binary formats?
My project Restruct[1] does Kaitai-like things but also supports serialization. Unfortunately, it only supports Go and only deals with Go struct tags rather than YAML manifests. Still, it totally can be used for serialization. I use it to sketch out quick projects against arbitrary binary formats. Two examples: first, a quick binwalk-like program for PNG only, which parses PNG headers and looks for the IEND chunk so it can extract accurately[2]; second, a program that splits FL Studio FLP projects by playlist track[3].
I feel like I’ve self-promoted Restruct like four times on Hacker News, and I feel kind of bad because it could use improvements and even some bug fixes and I never seem to get around to it. Oh well. It’s still useful for me, I hope it’s useful for others, too.
That said, Kaitai has a fairly clear path towards adding serialization from a design PoV; many things that would be calculated for parsing structures in deserialization could just become checks/assertions in serialization. As an example, checking that an expression calculates out to the expected value would be a reasonable approach. Reversible expressions could be implemented for some cases, too, if you want it to do more of the heavy lifting. I think the biggest obstacle is actually implementing it, and frankly my Scala is too weak to help with such a relatively big undertaking.
I’ve also played with the Rust nom library, which implements functional-programming-style parser combinators. It is quite cool how it can express fairly complex grammars and binary formats pretty much equally well, although optimizing it effectively requires serious magic that I do not think nom has. (I assume in Haskell, the same thing can be done with mind-boggling optimization power.)
[1]: https://github.com/go-restruct/restruct
-
Kaitai Struct: A new way to develop parsers for binary structures
I’m a big fan of Kaitai Struct, to the point where I’ve even contributed a small bit of improvements to its Go support, and I use it in a handful of small projects. It’s indispensable for spelunking blobs of binary data.
I’ve also taken some inspiration from it for a Go library I wrote, restruct:
https://github.com/go-restruct/restruct
… which is a bit like Go’s JSON encoding/decoding library, but with kaitai-like annotations for binary encoding. (Check the PNG example to see some of what can be done with it.)
-
Plain Text Protocols
Honestly, I dislike plaintext formats a lot. They are more accessible because they’re human readable. But this only extends to humans who happen to speak the language the protocol uses for keywords. While it’s not a huge ask, I still suggest this is mostly not that interesting of a benefit.
Parsing and emitting plaintext formats, meanwhile, is a rabbit hole. It’s human readable which makes you tempted to make it human writable. Should you accept extraneous whitespace? Tabs vs spaces? Terminating new line? Unix or DOS line endings? Etc.
Binary data may seem less accessible, but I blame the libraries. There are tons of easy ways to parse text. You can use string.split, atoi and scanf in your language of choice. What is there for binary?
In Go, the encoding/binary package actually implements something really cool: a simple reflection-based mechanism that can read and write binary data into a structure in a defined and simple way.
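As a rough illustration of that mechanism, the sketch below round-trips a fixed-size struct through `binary.Write` and `binary.Read`. The `Header` layout and the helper names are made up for this example; the one real constraint shown is that encoding/binary only handles fixed-size fields, so there are no strings or variable-length slices in the struct.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Header is a hypothetical fixed-size record used only for illustration.
type Header struct {
	Magic   [4]byte
	Version uint16
	Flags   uint16
	Length  uint32
}

// encodeHeader writes the fields in declaration order, little-endian, with no padding.
func encodeHeader(h Header) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, h); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeHeader fills a Header from the first 12 bytes of data.
func decodeHeader(data []byte) (Header, error) {
	var h Header
	err := binary.Read(bytes.NewReader(data), binary.LittleEndian, &h)
	return h, err
}

func main() {
	in := Header{Magic: [4]byte{'D', 'E', 'M', 'O'}, Version: 1, Flags: 0x8000, Length: 42}

	raw, err := encodeHeader(in)
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", raw)

	out, err := decodeHeader(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(out == in)
}
```

Anything variable-length (a string preceded by its size, an optional trailer) has to be handled by hand around these calls, which is the gap that tag-driven libraries aim to fill.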
lunixbochs extended this to struc[1], which adds additional tags for advanced reading and writing of binary structures, including variable length structures. I went further and maybe a bit off into the deep end with Restruct[2], a similar concept but with a lot more features, designed specifically so I could handle advanced structures quickly.
The end result is that I can define some Go structs with integers, strings, byte arrays and corresponding tags, and serialize and deserialize those structures to and from their binary representation. For an overdone demo of what you could do with Restruct, see this (incomplete) PNG demo: https://github.com/go-restruct/restruct/blob/master/formats/... (It is mainly incomplete because I had moved focus to developing a codegen for restruct to improve runtime performance, although such work has since stalled.)
HTTP Parser
-
eBPF will help solve service mesh by getting rid of sidecars
It does not look too different from the majority of HTTP parsers out there written in C. Here is an example from NodeJS [0].
[0] https://github.com/nodejs/http-parser/blob/main/http_parser....
-
C in Web Dev
NodeJS's HTTP parser used to be a handwritten C lib: http-parser
-
The history and reasons behind CORS, and how to use it
Whoa, I didn't know that! But yeah, it seems like https://github.com/nodejs/http-parser is based on nginx. It now uses https://github.com/nodejs/llhttp but has some of the same legacy.
On the other hand, deno's HTTP stuff is built on top of Hyper, a Rust library https://github.com/hyperium/hyper
-
How to pass ownership of std::function object to function pointer?
For cases where it is necessary to pass local information to/from a callback, the http_parser object's data field can be used.
From nodejs http-parser documentation:
-
Plain Text Protocols
Legacy HTTP/1.1 suffers from a few issues; see the current RFC errata:
https://www.rfc-editor.org/errata_search.php?rfc=7230&rec_st...
There are issues particularly around how whitespace and obsolete line folding should be handled:
https://github.com/nodejs/http-parser/issues?q=is%3Aissue+wh...
https://github.com/httpwg/http-core/issues/53
It's not as trivial as a few string splits.
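A small Go sketch of two of those edge cases, under stated assumptions: it follows the RFC 7230 section 3.2.4 option of replacing an obsolete line fold with a single space (a server may instead reject the message), and it rejects whitespace between a field name and the colon. The `field` type and `parseFields` function are hypothetical, and this is nowhere near a complete header parser (no token validation, no length limits, no duplicate handling).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// field is a hypothetical name/value pair for one header field.
type field struct{ Name, Value string }

var errBadHeader = errors.New("malformed header block")

// parseFields parses a CRLF-delimited header block and stops at the first empty line.
func parseFields(block string) ([]field, error) {
	var fields []field
	for _, line := range strings.Split(block, "\r\n") {
		if line == "" {
			break // the empty line ends the header block
		}
		if line[0] == ' ' || line[0] == '\t' {
			// obs-fold: a continuation of the previous field value.
			if len(fields) == 0 {
				return nil, errBadHeader
			}
			last := &fields[len(fields)-1]
			last.Value = strings.Trim(last.Value+" "+strings.Trim(line, " \t"), " \t")
			continue
		}
		name, value, ok := strings.Cut(line, ":")
		// Whitespace inside or after the field name ("Host : x") must be rejected.
		if !ok || name == "" || strings.ContainsAny(name, " \t") {
			return nil, errBadHeader
		}
		fields = append(fields, field{Name: name, Value: strings.Trim(value, " \t")})
	}
	return fields, nil
}

func main() {
	fields, err := parseFields("Host: example.com\r\nX-Long: first\r\n\t second\r\n\r\n")
	fmt.Println(fields, err)

	_, err = parseFields("Host : example.com\r\n\r\n")
	fmt.Println(err)
}
```

Even this toy version needs three special cases before it reaches the `name: value` split, which is the point the comment above is making.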
What are some alternatives?
llhttp - Port of http_parser to llparse
C++ Format - A modern formatting library
American Fuzzy Lop - A security-oriented fuzzer
semver.c - Semantic version in ANSI C
PHP CPP - Library to build PHP extensions with C++
stb - stb single-file public domain libraries for C/C++
Klib - A standalone and lightweight C library
ZXing - ZXing ("Zebra Crossing") barcode scanning library for Java, Android
ZBar - Clone of the mercurial repository http://zbar.hg.sourceforge.net:8000/hgroot/zbar/zbar
leaf - Lightweight Error Augmentation Framework
SLRE - Super Light Regexp engine for C/C++
value-category-cheatsheet - A C++14 cheat-sheet on lvalues, rvalues, xvalues, and more