Kaitai Struct
restruct
Kaitai Struct | restruct | |
---|---|---|
44 | 3 | |
3,839 | 345 | |
1.1% | 0.0% | |
7.5 | 3.2 | |
20 days ago | about 2 years ago | |
Shell | Go | |
GPL-3.0-or-later | ISC License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Kaitai Struct
- Reverse-engineering an encrypted IoT protocol
-
Parsing an Undocumented File Format
- ImHex [2], which has a pattern language [3] which allows parsing, and it seems more powerful than what Kaitai offers. I stumbled upon some limitations with it but it was still useful.
[1]: https://kaitai.io/
- Kaitai Struct – a declarative language used to describe binary data structures
-
HTTPie Desktop: cross-platform API testing client for humans
Beautiful. Didn't know something like this exists. Reminds me of Katai[0]
[0]. https://kaitai.io/
-
Hacking the LG Monitor's EDID
An EDID override like this would be helpful for macOS as well, where the monitors swapping around after standby is a real annoyance [0] [1]
EDID rewrites are 99% of the time blocked by the monitor firmware: https://notes.alinpanaitiu.com/Decoding-monitor-EDID-on-macO...
By the way, one helpful tool that helped me navigate the EDID dump was Kaitai Struct [2]. It shows a side by side view with the hex view and the EDID structure, and it highlights the hex values in real time as you navigate the structure. Unfortunately [3] it doesn't support the extension blocks that the author needs.
[0] https://notes.alinpanaitiu.com/Weird-monitor-bugs
[1] https://forums.macrumors.com/threads/external-displays-swapp...
[2] https://kaitai.io/
[3] https://github.com/kaitai-io/edid.ksy
- Kaitai Struct: new way to develop parsers for binary structures
-
Fq: Jq for Binary Formats
Kaitai Struct might be a good choice for that: https://kaitai.io/
-
Ingesting, parsing and making sense of device log data
For binary log format, there's the excellent Kaitai Struct frameworks, that make it very easy to generate parsers from a declarative schema
-
What is this tool? More info in comments
kaitai
-
Visual Programming with Elixir: Learning to Write Binary Parsers (2019)
https://kaitai.io/
Worth a look if you are writing binary parsers.
restruct
-
Why isn't there a Swagger/OpenAPI for binary formats?
My project Restruct[1] does Kaitai-like things but also supports serialization. Unfortunately, it only supports Go and only deals with Go struct tags rather than YAML manifests. Still, it totally can be used for serialization. I use it to sketch out quick projects against arbitrary binary formats. Two examples: one, parsing PNG headers to implement a quick binwalk-like program for just PNG that looks for the IEND chunk to extract accurately[2], two, a program that splits FL Studio FLP projects by playlist track[3].
I feel like I’ve self-promoted Restruct like four times on Hacker News, and I feel kind of bad because it could use improvements and even some bug fixes and I never seem to get around to it. Oh well. It’s still useful for me, I hope it’s useful for others, too.
That said, Kaitai has a fairly clear path towards adding serialization from a design PoV; many things that would be calculated for parsing structures in deserialization could just become checks/assertions in serialization. As an example, checking that an expression calculates out to the expected value would be a reasonable approach. Reversible expressions could be implemented for some cases, too, if you want it to do more of the heavy lifting. I think the biggest obstacle is actually implementing it, and frankly my Scala is too weak to help with such a relatively big undertaking.
I’ve also played with the rust nom library, which implements functional programming style parser combinators. It is quite cool how it can express fairly complex grammars and binary formats pretty much equally well, albeit optimizing it effectively requires serious magic that I do not think nom has. (I assume in Haskell, the same thing can be done with mind-boggling optimization power.)
[1]: https://github.com/go-restruct/restruct
[2]: https://github.com/jchv/pngextract
[3]: https://github.com/jchv/flsplit
-
Kaitai Struct: A new way to develop parsers for binary structures
I’m a big fan of Kaitai Struct, to the point where I’ve even contributed a small bit of improvements to its Go support, and I use it in a handful of small projects. It’s indispensable for spelunking blobs of binary data.
I’ve also taken some inspiration with a Go library I wrote, restruct:
https://github.com/go-restruct/restruct
… which is a bit like Go’s JSON encoding/decoding library, but with kaitai-like annotations for binary encoding. (Check the PNG example to see some of what can be done with it.)
-
Plain Text Protocols
Honestly, I dislike plaintext formats a lot. It is more accessible because it’s human readable. But, this only extends to humans who happen to speak the language the protocol uses for keywords. While it’s not a huge ask, I still suggest this is mostly not that interesting of a benefit.
Parsing and emitting plaintext formats, meanwhile, is a rabbit hole. It’s human readable which makes you tempted to make it human writable. Should you accept extraneous whitespace? Tabs vs spaces? Terminating new line? Unix or DOS line endings? Etc.
Binary data may seem less accessible, but I blame the libraries. There’s tons of easy ways to parse text. You can use string.split, atoi and scanf in your language of choice. What is there for binary?
In Go, the encoding/binary package actually implements something really cool. A simple reflection-based mechanism that can read and write binary data into a structure in a defined and simple way.
lunixbochs extended this to struc[1], which adds additional tags for advanced reading and writing of binary structures, including variable length structures. I went further and maybe a bit off into the deep end with Restruct[2], a similar concept but with a lot more features, designed specifically so I could handle advanced structures quickly.
The end result is that I can define some Go structs with integers, strings, byte arrays and corresponding tags, and be able to serialize and deserialize from those structures to their corresponding binary representation. For an overdone demo of what you could do with Restruct for example, see this (incomplete) PNG demo: https://github.com/go-restruct/restruct/blob/master/formats/... (It is mainly incomplete because I had moved focus to develop a codegen for restruct, to improve runtime performance, although such work has since stalled.)
[1]: https://pkg.go.dev/github.com/lunixbochs/struc
[2]: https://pkg.go.dev/github.com/go-restruct/restruct
What are some alternatives?
Protobuf - Protocol Buffers - Google's data interchange format
binrw - A Rust crate for helping parse and rebuild binary data using ✨macro magic✨.
csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats.
cantordust - Public repository for Cantordust Ghidra plugin.
Camelot - A Python library to extract tabular data from PDFs
kaitai-to-wireshark - Converts a Kaitai Struct file description to a Wireshark LUA plugin
tablib - Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
HTTP Parser - http request/response parser for c
PDFMiner - Python PDF Parser (Not actively maintained). Check out pdfminer.six.
RecordFlux - Formal specification and generation of verifiable binary parsers, message generators and protocol state machines
PyYAML
wuffs - Wrangling Untrusted File Formats Safely