|Scala Parser Combinators||Kaitai Struct|
|11 days ago||about 2 months ago|
|Apache License 2.0||GPL-3.0-or-later|
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Scala Parser Combinators
Thorough tutorial for scala.util.parsing.combinator._
2 projects | reddit.com/r/scala | 28 Apr 2022
If you find anything that isn't already linked from the README at https://github.com/scala/scala-parser-combinators , please PR the addition to the readme.
-🎄- 2021 Day 18 Solutions -🎄-
144 projects | reddit.com/r/adventofcode | 17 Dec 2021
Mostly a mess of pattern matching. I really need to make some generic tree utilities. Haven't been able to find a decent parser combinator that works in Scala 3 (I usually use fastparse which depends heavily on Scala 2 macros, and scala-parser-combinators works in Scala 3, but I've had a lot of trouble getting it to not be too greedy), so I used the state monad from cats to parse at the bottom of the file, which I think turned out fairly nice.
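The idea of threading the remaining input through each parsing step can be sketched without any library. This is only an illustration, written in Python rather than Scala, and it is not the commenter's code: `parse_element`, `parse_pair`, `expect`, and `parse` are hypothetical names, and the input is assumed to be the nested-pair notation from that puzzle, such as `[[1,2],3]`. Each function takes the text and a position and returns the parsed value together with the next position, which is the same shape a state-monad parser has.

```python
from typing import Tuple, Union

# A tree is either an int leaf or a (left, right) tuple.
Tree = Union[int, tuple]

def expect(text: str, pos: int, char: str) -> int:
    """Check that `char` is at `pos` and return the position after it."""
    if pos >= len(text) or text[pos] != char:
        raise ValueError(f"expected {char!r} at position {pos}")
    return pos + 1

def parse_element(text: str, pos: int) -> Tuple[Tree, int]:
    """Parse a number or a pair at `pos`; return (value, next position)."""
    if pos < len(text) and text[pos] == "[":
        return parse_pair(text, pos)
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == pos:
        raise ValueError(f"expected a digit or '[' at position {pos}")
    return int(text[pos:end]), end

def parse_pair(text: str, pos: int) -> Tuple[Tree, int]:
    """Parse '[' element ',' element ']' and thread the position through."""
    pos = expect(text, pos, "[")
    left, pos = parse_element(text, pos)
    pos = expect(text, pos, ",")
    right, pos = parse_element(text, pos)
    pos = expect(text, pos, "]")
    return (left, right), pos

def parse(text: str) -> Tree:
    """Parse the whole string and reject trailing input."""
    tree, pos = parse_element(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected trailing input at position {pos}")
    return tree
```

Because every step consumes only what it needs and hands back the new position, there is no greedy matching to fight with; the trade-off is that error handling and alternatives have to be written by hand.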
Ask HN: What software do you use to examine binary files?
4 projects | news.ycombinator.com | 5 Sep 2022
There are a few hex/disk editors that support "templates" (but most of the time you need to create those yourself).
Here is a sort of "curated list" of related tools:
The most complete/populated I know of is Kaitai, which you can use in Hiew through Kiewtai.
If the question is slightly different, i.e. which bytes are used to identify a given file format, there is TrID, which also has a database of known headers/patterns.
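Identifying a format from its leading bytes can be illustrated with a very small lookup table. The sketch below is an assumption-laden toy, not a description of how TrID works internally: `MAGIC_SIGNATURES` and `identify` are hypothetical names, and the table holds only a handful of widely documented signatures.

```python
# Hypothetical, deliberately tiny signature table: (leading bytes, description).
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
]

def identify(data: bytes) -> str:
    """Return a description for the first signature that matches the start of data."""
    for signature, description in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return description
    return "unknown"
```

A real identification tool would need a far larger database and would typically look at more than the first few bytes, since many container formats share the same header.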
Invisible XML is a language for describing the implicit structure of data
2 projects | news.ycombinator.com | 16 Jul 2022
I don't get the impression this is designed for binary formats, merely "non XML" ones. The task you described sounds like a better fit for https://kaitai.io/
Is there any good binary serializer & deserializer for C / C++?
5 projects | reddit.com/r/cpp_questions | 4 Jun 2022
I'm aware there is Kaitai Struct, which can handle binary parsing (deserializing). And I have had some success previously with Python's Construct, which can both serialize and deserialize, but it's written in Python.
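For readers unfamiliar with what "both serialize and deserialize" means in practice, here is a minimal round-trip sketch using Python's standard `struct` module. It does not answer the C/C++ question and does not use Construct or Kaitai Struct; the record layout (`uint16` id, `float32` value, `uint8` flags, little-endian) and the names `Record`, `serialize`, and `deserialize` are made up for illustration.

```python
import struct
from typing import NamedTuple

# Hypothetical layout: little-endian uint16 id, float32 value, uint8 flags (7 bytes).
RECORD_LAYOUT = struct.Struct("<HfB")

class Record(NamedTuple):
    record_id: int
    value: float
    flags: int

def serialize(record: Record) -> bytes:
    """Pack the record into its fixed-size binary form."""
    return RECORD_LAYOUT.pack(record.record_id, record.value, record.flags)

def deserialize(data: bytes) -> Record:
    """Unpack exactly one record from its binary form."""
    return Record(*RECORD_LAYOUT.unpack(data))
```

Keeping the layout in one place (`RECORD_LAYOUT`) means the packing and unpacking code cannot drift apart, which is the main benefit a two-way library offers over a parse-only one.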
help with caviar voxel format
2 projects | reddit.com/r/VoxelGameDev | 26 May 2022
It seems like an interesting project. I've written a kaitai specification for the format based on the wiki page you linked. Kaitai is a DSL for describing binary file formats with library bindings for C++ and several other languages. It should save quite some time compared to manually writing a decoder for the file format.
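To show why a declarative description can save time over a hand-written decoder, the following sketch keeps the format as plain data and uses one generic function to interpret it. Everything here is an assumption for illustration: `HEADER_SPEC` is a made-up header (it is neither the caviar format nor Kaitai Struct syntax), and `read_fields` is a hypothetical helper built on Python's standard `struct` module.

```python
import struct

# Hypothetical header description: (field name, struct format code).
# This is NOT the caviar format and NOT Kaitai Struct's .ksy syntax.
HEADER_SPEC = [
    ("magic", "4s"),
    ("width", "H"),
    ("height", "H"),
    ("depth", "H"),
]

def read_fields(data: bytes, spec, offset: int = 0, byte_order: str = "<"):
    """Read each field in spec from data; return (fields dict, offset after the last field)."""
    fields = {}
    for name, code in spec:
        fmt = byte_order + code
        fields[name] = struct.unpack_from(fmt, data, offset)[0]
        offset += struct.calcsize(fmt)
    return fields, offset
```

Adding or reordering a field only touches the specification list, not the reading logic. A real DSL goes much further, for example with conditional and repeated fields, and generates the reader for several target languages.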
Japanese Words in Neuromancer
6 projects | news.ycombinator.com | 8 May 2022
This reminds me of a list I’ve been compiling for the past couple of years: English-language software or products with names taken from Japanese. I find them interesting because there has long been awareness, discussion, and controversy in Japan about the opposite phenomenon: English words used in Japanese.
The following examples all came from HN:
Koi Pond, a load testing tool. Koi (鯉) means “carp.”
Anki, a flash card tool. Often mentioned in HN discussions. Anki (暗記) means “memorization.”
Bento, a framework for development of Linux kernel file systems. A bento (弁当) is a meal in a box.
Umami, a website analytics tool. Umami (旨味)’s original meaning is “taste, flavor, deliciousness”; it now also refers to a particular basic taste sensation.
Senpai, a gaming assistant. Senpai (先輩) means “someone senior to or older than one, typically in an educational or workplace hierarchy.”
Shodan, a search engine. Shodan (初段) means “first-level ranking in a skill, etc.”
YubiKey, an authentication device. Yubi (指) means “finger.”
Asahi Linux. Asahi (朝日, 旭) means “morning sun.”
Neko, a virtual browser. Neko (猫) means “cat.”
Kaitai Struct, a declarative language for binary data structures. Kaitai (解体) means “disassembly.”
Hikari, a custom logon script engine for Windows. Hikari (光) means “light.”
Hikari, a Wayland compositor.
Hikari, a thread manager and dispatcher.
Alternatives to Spicy protocol parser generator
2 projects | reddit.com/r/embedded | 2 Apr 2022
- Protlr (https://www.protlr.com/) - proprietary
- Meta II (https://news.ycombinator.com/item?id=13039981) - haven't yet looked into it
- Ragel (http://www.colm.net/open-source/ragel/) - seems promising
- Lemon (https://sqlite.org/src/doc/trunk/doc/lemon.html) - alternative to yacc; seems promising, but requires tweaking
- Kaitai (https://kaitai.io/) - desktop.
Why isn't there a Swagger/OpenAPI for binary formats?
13 projects | news.ycombinator.com | 25 Mar 2022
Maybe Kaitai Struct is what you are looking for? ;)
You are looking for Kaitai Struct: https://kaitai.io/
Kaitai allows you to declaratively define a binary format bit for bit; they even have an IDE.
Kaitai Struct: A new way to develop parsers for binary structures
12 projects | news.ycombinator.com | 17 Mar 2022
I contributed a number of file formats a few years ago (and attempted numerous others) but ran into a number of problems with certain file formats:
1. It's not possible to read from the file until a multi-byte termination sequence is detected.
2. You can't read sections of a file where the termination condition is the presence of a sequence of bytes denoting the next unrelated section of the file (and you don't want to consume/read these bytes) 
3. The WebIDE at the time couldn't handle very large file format specifications such as Photoshop (PSD) 
4. Files containing compressed or encrypted sections require a compression/encryption algorithm to be hardcoded into Kaitai struct libraries for each programming language it can output to.
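Items 1 and 2 in the list above describe two variants of the same operation: scanning forward to a multi-byte marker, and then either consuming the marker or leaving it in place for the next section. The hand-written Python sketch below shows that behaviour only; `read_until` is a hypothetical helper, and nothing here is a claim about what Kaitai Struct does or does not support today.

```python
from typing import Tuple

def read_until(data: bytes, pos: int, terminator: bytes, consume: bool) -> Tuple[bytes, int]:
    """Return the bytes from pos up to terminator, plus the next read position.

    With consume=True the next position is just past the terminator (item 1).
    With consume=False it points at the terminator, so the following
    section can still read its own marker (item 2).
    """
    end = data.find(terminator, pos)
    if end == -1:
        raise ValueError(f"terminator {terminator!r} not found after position {pos}")
    next_pos = end + len(terminator) if consume else end
    return data[pos:end], next_pos
```

For example, reading a header terminated by `b"\r\n\r\n"` would use `consume=True`, while reading a payload that ends where the next section's marker begins would use `consume=False`.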
The WebIDE I particularly liked as it makes it easy to get started and share results. I also liked how Kaitai Struct allows easy definition of constraints (simple ones at least) into the file format specification so that you can say "this section of the file shall have a size not exceeding header.length * 2 bytes".
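The quoted constraint can be written out as a plain check to make its meaning concrete. This is a minimal sketch in Python, not Kaitai Struct syntax; `check_section_size` is a hypothetical name, and `header_length` stands in for the `header.length` field mentioned in the comment.

```python
def check_section_size(header_length: int, section: bytes) -> None:
    """Raise ValueError if the section is larger than header_length * 2 bytes."""
    limit = header_length * 2
    if len(section) > limit:
        raise ValueError(f"section is {len(section)} bytes, limit is {limit}")
```

In a declarative specification the same rule lives next to the field definition instead of being scattered through decoder code.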
Some alternative binary file format specification attempts for those interested in seeing alternatives, each with their own set of problems/pros/cons:
1. 010 Editor 
2. Synalysis 
3. hachoir 
4. DFDL
What are some alternatives?
Protobuf - Protocol Buffers - Google's data interchange format
csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats.
Fast Parse - Writing Fast Parsers Fast in Scala
rizin - UNIX-like reverse engineering framework and command-line toolset.
Camelot - A Python library to extract tabular data from PDFs
PDFMiner - Python PDF Parser (Not actively maintained). Check out pdfminer.six.
tablib - Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
unoconv - Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.
Scopt - command line options parsing for Scala