-
Kaitai Struct
Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
-
ImHex
🔍 A Hex Editor for Reverse Engineers, Programmers and people who value their retinas when working at 3 AM.
3) try to reproduce the artifacts with a NodeJS script.
A simple `xlsx2csv` NodeJS script generates CSV text from an XLSX file, and a simple `diff` against the expected result reveals any deviations.
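The comparison step can be sketched roughly as follows. This is only an illustration of the "generate, then diff" idea, not the actual SheetJS tooling: `firstDeviation` is a hypothetical helper, and the two sample strings stand in for the CSV produced by the script and the plaintext artifact exported from Excel.

```javascript
// Hypothetical helper: report the first line where generated CSV text
// deviates from the expected plaintext artifact, or null if they match.
function firstDeviation(actualText, expectedText) {
  // Normalize line endings so CRLF vs LF alone does not count as a deviation.
  const actual = actualText.replace(/\r\n/g, "\n").split("\n");
  const expected = expectedText.replace(/\r\n/g, "\n").split("\n");
  const count = Math.max(actual.length, expected.length);
  for (let i = 0; i < count; i++) {
    if (actual[i] !== expected[i]) {
      // Line numbers are 1-based, as in typical diff output.
      return { line: i + 1, actual: actual[i], expected: expected[i] };
    }
  }
  return null;
}

// Stand-in data (assumed for illustration): what Excel exported vs. what
// the parser under test produced.
const expectedCsv = "a,b\n1,2\n";
const generatedCsv = "a,b\n1,3\n";
console.log(firstDeviation(generatedCsv, expectedCsv));
```

In a real test run the two strings would be read from disk, one pair per sample workbook, and any non-null result would fail the test.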
No one in the preceding 30 years thought to do the same! Many open source projects including OpenOffice/LibreOffice had huge collections of sample XLS and XLSX artifacts without accompanying plaintext artifacts.
We wrote a series of scripts to automate Excel and generate the desired artifacts. https://github.com/SheetJS/test_files/blob/master/tests/txt.... is an AppleScript automation script for Excel 2011 for Mac.
Those tests have revealed a number of unexpected bugs in third-party tools and regressions in Excel itself. For example, Excel 5.0 introduced the datetime format `yyyy-mm-dd [hh]:mm:ss`. The value 0.001 is expected to be rendered as "1900-01-00 00:01:26", and Excel 5.0 through 2003 worked as expected. Excel 2007 changed number formatting, and newer versions show the nonsensical result "1900-01-00 645:01:26" (nonsensical because the value 0.001 represents less than one hour).
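The arithmetic behind the expected time portion can be checked with a short sketch. It assumes that the serial value counts days and that seconds are rounded to the nearest whole second. `elapsedHms` is a hypothetical helper written for this illustration, not Excel's or SheetJS's formatter, and it covers only the `[hh]:mm:ss` part of the format for non-negative values.

```javascript
// Serial values count days, so a fraction of 1 is a fraction of a day.
const SECONDS_PER_DAY = 24 * 60 * 60;

// Hypothetical helper: render a non-negative serial value as elapsed [hh]:mm:ss.
function elapsedHms(serial) {
  // Assumption: round to whole seconds before splitting into fields.
  const totalSeconds = Math.round(serial * SECONDS_PER_DAY);
  const hours = Math.floor(totalSeconds / 3600); // elapsed hours, not wrapped at 24
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const pad = (n) => String(n).padStart(2, "0");
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)}`;
}

// 0.001 days is 86.4 seconds, i.e. 1 minute 26 seconds after rounding.
console.log(elapsedHms(0.001)); // prints "00:01:26"
```

Since 0.001 of a day is well under one hour, any hour field other than 00 cannot be right for this value.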
I am trying to parse the data tables from the undocumented MS Access file format using NodeJS/JavaScript. I last tried about 3 years ago and it was really tough going, with a lot of trial and error spread out over several months. I managed to parse some basic MS Access files, but I still need to figure out a way to read the whole database more reliably. My effort was here:
https://github.com/yazz/noaccess
- ImHex [2], whose pattern language [3] allows parsing and seems more powerful than what Kaitai offers. I ran into some limitations with it, but it was still useful.
[1]: https://kaitai.io/