tablib
Kaitai Struct
Metric | tablib | Kaitai Struct
---|---|---
Mentions | 2 | 32
Stars | 4,220 | 3,278
Growth | 0.7% | 1.9%
Activity | 4.2 | 7.4
Latest commit | 6 days ago | 6 months ago
Language | Python | Shell
License | MIT License | GPL-3.0-or-later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tablib
-
Fun with File Formats
There are two problems leading to the decision of only accepting public domain info: licensing and provenance.
"Licensing" is hard. The "Open Specifications Promise" [1], which covers a bunch of Microsoft-designed file formats, is merely a covenant not to sue.
"Provenance" is tricky. For example, much of the knowledge of the Apple iWork formats was derived by reverse-engineering the source programs and extracting protobuf definitions. Many open source projects have freely copied from each other, making detailed analysis tricky [2].
[1] https://en.wikipedia.org/wiki/Microsoft_Open_Specification_P...
Kaitai Struct
-
What projects are you working on or planning to do this year?
Speaking of reading binary data, I later found Kaitai Struct. You describe a file format in YAML, and it transpiles that description into parsers for different programming languages. Highly recommend it if all you need is to read (it cannot serialize data back to binary). There's even a web IDE to play with files, which is quite fun.
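To make the "describe the format, get a parser" idea concrete, here is a minimal Python sketch that uses only the standard `struct` module. It is not Kaitai Struct's YAML syntax or its generated API; the `HEADER_SPEC` table, the `parse_fields` helper, and the `DEMO` header layout are all hypothetical, made up for illustration.

```python
import struct

# Hypothetical declarative description of a fixed-size header:
# an ordered list of (field name, struct format code) pairs.
HEADER_SPEC = [
    ("magic", "4s"),    # 4 raw bytes
    ("version", "H"),   # unsigned 16-bit integer
    ("width", "I"),     # unsigned 32-bit integer
    ("height", "I"),    # unsigned 32-bit integer
]


def parse_fields(spec, data, byte_order="<"):
    """Parse `data` according to `spec` and return a dict of field values."""
    # Build one struct format string from the description ("<" = little-endian, no padding).
    fmt = byte_order + "".join(code for _, code in spec)
    size = struct.calcsize(fmt)
    if len(data) < size:
        raise ValueError(f"need {size} bytes, got {len(data)}")
    values = struct.unpack(fmt, data[:size])
    return dict(zip((name for name, _ in spec), values))


# Build a sample header for the made-up format and read it back.
sample = b"DEMO" + struct.pack("<HII", 1, 640, 480)
header = parse_fields(HEADER_SPEC, sample)
```

The point of the design is that the field table is data, not code: changing the format means editing the description, while the parsing logic stays the same. Like the tool described above, this sketch only reads; it does not write data back to binary.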
-
TreeSheets: Open-Source Free Form Data Organizer (Hierarchical Spreadsheet)
Have you been exposed to Kaitai Struct[0] yet? As someone who wanted to use binary data from programs, I’ve used it quite successfully and, even more to the point, quite happily.
-
Hey Rustaceans! Got a question? Ask here! (49/2022)!
The optimal solution for me would be something like Kaitai Struct with Rust generators and serialization support.
-
Favorite hidden gem library?
And I love https://kaitai.io/ for data parsing. Not commonly required, but when it is, it's so good!
-
Ask HN: What software do you use to examine binary files?
There are a few hex/disk editors that support "templates" (but most of the time you need to create those yourself).
Here is a sort of "curated list" of related tools:
https://github.com/dloss/binary-parsing
The most complete/populated I know of is Kaitai:
which you can use in Hiew via Kiewtai:
https://github.com/taviso/kiewtai
If the question is slightly different, i.e. which bytes are used to identify a given file format, there is TrID:
https://mark0.net/soft-trid-e.html
which also has a database of known headers/patterns.
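As a rough illustration of identifying a format from its leading bytes, here is a small Python sketch. It is not TrID's database or algorithm; the `SIGNATURES` table and the `identify` function are hypothetical names, and the table holds only a handful of well-known magic numbers.

```python
# A tiny, hand-picked table of (leading bytes, format name) pairs.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
]


def identify(data: bytes) -> str:
    """Return the name of the first signature that `data` starts with, or "unknown"."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


# Only the first few bytes of a file are needed for this check.
kind = identify(b"%PDF-1.7\n")
```

A real identifier needs far more patterns and has to cope with container formats: many document types are ZIP archives underneath and share the same leading bytes, so a prefix match alone cannot tell them apart.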
-
Invisible XML is a language for describing the implicit structure of data
I don't get the impression this is designed for binary formats, merely "non XML" ones. The task you described sounds like a better fit for https://kaitai.io/
-
Is there any good binary serializer & deserializer for C / C++?
I'm aware there is Kaitai Struct, which can handle binary parsing (deserializing). And I've previously had some success with Python's Construct, which can both serialize and deserialize, but it's written in Python.
-
help with caviar voxel format
It seems like an interesting project. I've written a kaitai specification for the format based on the wiki page you linked. Kaitai is a DSL for describing binary file formats with library bindings for C++ and several other languages. It should save quite some time compared to manually writing a decoder for the file format.
-
Japanese Words in Neuromancer
This reminds me of a list I’ve been compiling for the past couple of years: English-language software or products with names taken from Japanese. I find them interesting because there has long been awareness, discussion, and controversy in Japan about the opposite phenomenon: English words used in Japanese.
The following examples all came from HN:
Koi Pond, a load testing tool. Koi (鯉) means “carp.”
https://slack.engineering/load-testing-with-koi-pond/
Anki, a flash card tool. Often mentioned in HN discussions. Anki (暗記) means “memorization.”
Bento, a framework for development of Linux kernel file systems. A bento (弁当) is a meal in a box.
https://arxiv.org/abs/2005.09723
Umami, a website analytics tool. Umami (旨味)’s original meaning is “taste, flavor, deliciousness”; it now also refers to a particular basic taste sensation.
Senpai, a gaming assistant. Senpai (先輩) means “someone senior to or older than one, typically in an educational or workplace hierarchy.”
Shodan, a search engine. Shodan (初段) means “first-level ranking (in a skill, etc.).”
YubiKey, an authentication device. Yubi (指) means “finger.”
Asahi Linux. Asahi (朝日, 旭) means “morning sun.”
Neko, a virtual browser. Neko (猫) means “cat.”
Kaitai Struct, a declarative language for binary data structures. Kaitai (解体) means “disassembly.”
Hikari, a custom logon script engine for Windows. Hikari (光) means “light.”
https://github.com/NoenDex/Hikari
Hikari, a Wayland compositor.
https://hikari.acmelabs.space/
Hikari, a thread manager and dispatcher.
https://artificialilliteracy.wordpress.com/2015/06/27/introd...
-
Alternatives to Spicy protocol parser generator
- Protlr (https://www.protlr.com/) - proprietary
- Meta II (https://news.ycombinator.com/item?id=13039981) - haven't yet looked into it
- Ragel (http://www.colm.net/open-source/ragel/) - seems promising
- Lemon (https://sqlite.org/src/doc/trunk/doc/lemon.html) - alternative to yacc. Seems promising, but requires tweaking
- Kaitai (https://kaitai.io/) - desktop.
What are some alternatives?
Protobuf - Protocol Buffers - Google's data interchange format
csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats.
Camelot - A Python library to extract tabular data from PDFs
pymorphy2 - Morphological analyzer / inflection engine for Russian and Ukrainian languages.
rizin - UNIX-like reverse engineering framework and command-line toolset.
PyYAML
Fast Parse - Writing Fast Parsers Fast in Scala
PDFMiner - Python PDF Parser (Not actively maintained). Check out pdfminer.six.
unoconv - Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.
Scopt - command line options parsing for Scala
python-docx - Create and modify Word documents with Python
Python-Markdown - A Python implementation of John Gruber’s Markdown with Extension support.