ghidra-delinker-extension
KeenWrite
 | ghidra-delinker-extension | KeenWrite
---|---|---
Mentions | 6 | 98
Stars | 35 | 621
Growth | - | -
Activity | 7.8 | 0.0
Last commit | 7 days ago | 8 months ago
Language | Java | Java
License | Apache License 2.0 | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ghidra-delinker-extension
-
Ask HN: What rabbit hole(s) did you dive into recently?
I did, you can find the Ghidra extension there: https://github.com/boricj/ghidra-delinker-extension
The problem is properly identifying the relocation spots and their targets inside a Ghidra database, which is based on references. On x86 it's fairly easy because there's usually a 4-byte absolute or relative immediate operand within the instruction that carries the reference. On MIPS it's very hard because of split MIPS_HI16/MIPS_LO16 relocations, and because the actual reference can be hundreds of instructions away.
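To make the HI16/LO16 split concrete, here is a minimal sketch of how a 32-bit address is carried by a LUI/ADDIU pair. The function names are hypothetical and this is not code from the extension; it only shows the standard carry adjustment that makes the two halves depend on each other.

```python
def split_hi_lo(address: int) -> tuple:
    """Split a 32-bit address into the 16-bit immediates of a LUI/ADDIU pair."""
    lo = address & 0xFFFF
    # ADDIU sign-extends its immediate, so when bit 15 of the low half is set
    # the high half has to be bumped by one to compensate.
    hi = ((address + 0x8000) >> 16) & 0xFFFF
    return hi, lo


def join_hi_lo(hi: int, lo: int) -> int:
    """Rebuild the address the way the CPU does: (hi << 16) + sign_extend(lo)."""
    signed_lo = lo - 0x10000 if lo & 0x8000 else lo
    return ((hi << 16) + signed_lo) & 0xFFFFFFFF


hi, lo = split_hi_lo(0x8001A3F4)
print(hex(hi), hex(lo))  # 0x8002 0xa3f4
```

Note that the high half reads 0x8002 even though the address starts with 0x8001: neither immediate can be interpreted without finding its partner, which is why the pairing has to be recovered by analysis.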
So you need both instruction flow analysis strong enough to handle large functions and code built with optimizations, as well as pattern matching for the various possible instruction sequences, some of them overlapping and others looking like regular expressions in the case of accessing multi-dimensional arrays. All of that while trying to avoid algorithms with bad worst cases because it'll take too long to run on large functions (each ADDU instruction generates two paths to analyze because of the two source registers).
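The cost of the two paths per ADDU can be illustrated with a toy model. The sketch below is an assumption-laden simplification (straight-line code, a made-up instruction tuple, hypothetical function names), not the analyzer from the extension: it counts how many definition paths reach a register and how many sub-problems are visited when results are memoized.

```python
from functools import lru_cache
from typing import List, Tuple

# Toy straight-line instruction model: (destination, opcode, source registers)
Instruction = Tuple[str, str, Tuple[str, ...]]


def count_paths(code: List[Instruction], register: str, index: int) -> Tuple[int, int]:
    """Return (definition paths reaching `register` before `index`, sub-problems visited)."""
    visited = 0

    @lru_cache(maxsize=None)
    def walk(reg: str, at: int) -> int:
        nonlocal visited
        visited += 1
        # Scan backwards for the most recent instruction that writes `reg`.
        for i in range(at - 1, -1, -1):
            dest, _opcode, sources = code[i]
            if dest == reg:
                if not sources:
                    return 1  # a constant load such as LUI ends the path
                # An ADDU has two sources, so it forks into two paths.
                return sum(walk(src, i) for src in sources)
        return 1  # register is live on entry

    return walk(register, index), visited


code = [("t0", "lui", ())] + [("t0", "addu", ("t0", "t0"))] * 30
print(count_paths(code, "t0", len(code)))  # (1073741824, 31)
```

Enumerating every path in this example would take about a billion steps, while caching each (register, position) pair needs 31. A real analyzer also has to cope with branches and loops, which this sketch ignores.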
Besides that, you're working on top of a Ghidra database mostly filled by Ghidra's analyzers, which aren't perfect. Incorrect data within that database, like constants mistaken for addresses, off-by-n references or missing references will lead to very exotic undefined behaviors by the delinked code unless cleaned up by hand. I have some diagnostics to help identify some of these cases, but it's very tricky.
On top of that, the delinked object file doesn't have debugging symbols, so it's challenging to figure out what's going wrong with a debugger when there's a failure. It could be an immediate segmentation fault, or the program can work without crashing but with its execution flow incorrect or generating incorrect data as output. I've thought about generating DWARF or STABS debugging data from Ghidra's database, but it sounds like yet another rabbit hole.
I'm on my fifth or sixth iteration of the MIPS analyzer, each one better than the previous one, but it's still choking on kilobytes-long functions.
Also, I've only covered 32-bit x86 and MIPS on ELF for C code. The matrix of ISAs and object file formats (ELF, Mach-O, COFF, a.out, OMF...) is rather large. C++ or Fortran would require special considerations for COMMON sections (vtables, typeinfos, inline functions, default constructors/destructors, implicit template instantiations...). This is why I think there are one or two theses to be written here: the rabbit hole is really that deep once you start digging.
Sorry for the walls of text, but without literature on this I'm forced to build up my explanations from basic principles just so that people have a chance of following along.
-
Exploring Object File Formats
extension [1]. It's a bit finicky to get it right (toolchains assume that object files are valid and don't have much in the way of diagnostics), but these are fairly simple under the hood. Section bytes, symbols and relocations, with some headers and metadata to wrap these up...
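As a rough picture of "section bytes, symbols and relocations", the following sketch models an object file with plain dataclasses. All class and field names are hypothetical and chosen for illustration; real formats such as ELF add headers, string tables and many more attributes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Symbol:
    name: str
    section: Optional[str]  # None marks an undefined (external) symbol
    offset: int = 0


@dataclass
class Relocation:
    offset: int   # where in the section the patch is applied
    symbol: str   # which symbol the patched value refers to
    kind: str     # relocation type, e.g. "R_386_PC32"
    addend: int = 0


@dataclass
class Section:
    name: str
    data: bytes
    relocations: List[Relocation] = field(default_factory=list)


@dataclass
class ObjectFile:
    sections: Dict[str, Section] = field(default_factory=dict)
    symbols: Dict[str, Symbol] = field(default_factory=dict)

    def undefined_symbols(self) -> List[str]:
        """Names the linker must resolve from other object files."""
        return sorted(n for n, s in self.symbols.items() if s.section is None)


# A function `main` whose body is a single `call puts` with an unresolved operand.
text = Section(".text", bytes.fromhex("e8fcffffff"),
               [Relocation(offset=1, symbol="puts", kind="R_386_PC32")])
obj = ObjectFile(
    sections={".text": text},
    symbols={"main": Symbol("main", ".text", 0), "puts": Symbol("puts", None)},
)
print(obj.undefined_symbols())  # ['puts']
```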
It's a bit of a shame that object files aren't more of a lingua franca of toolchains in practice. Embedding binary blobs inside a program in a portable way is still a mess today.
[1] https://github.com/boricj/ghidra-delinker-extension/tree/mas...
- Show HN: A Ghidra extension that turns programs back into object files
-
Ask HN: Show me your half baked project
Ghidra extension for delinking programs back into object files: https://github.com/boricj/ghidra-delinker-extension
In short, this Ghidra extension allows one to reconstruct relocation tables through analysis and then export parts of programs as working object files, effectively reversing the work of a linker. Applications include binary patching, converting between object file formats, software ports without source code, decompilation projects...
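To show what "reversing the work of a linker" means for a single operand, here is a minimal sketch for 32-bit x86. It relies on the standard i386 ELF formulas (R_386_32 stores S + A, R_386_PC32 stores S + A - P); the function name and parameters are hypothetical and this is not the extension's implementation.

```python
import struct


def unapply_rel32(data: bytearray, offset: int, target: int,
                  place: int, pc_relative: bool) -> int:
    """Replace a resolved 32-bit operand with its addend and return that addend.

    target is S (address of the referenced symbol), place is P (address of
    the operand itself); both come from the analysis of the linked program.
    """
    (value,) = struct.unpack_from("<I", data, offset)
    addend = value - target          # absolute: value = S + A
    if pc_relative:
        addend += place              # relative: value = S + A - P
    addend &= 0xFFFFFFFF
    struct.pack_into("<I", data, offset, addend)
    return addend


# `call 0x2000` located at 0x1000: opcode E8, operand at 0x1001.
code = bytearray.fromhex("e8fb0f0000")
call_addend = unapply_rel32(code, 1, target=0x2000, place=0x1001, pc_relative=True)
print(code.hex())  # e8fcffffff

# `mov eax, [0x3010]` referencing a symbol at 0x3000 plus 0x10.
load = bytearray.fromhex("a110300000")
load_addend = unapply_rel32(load, 1, target=0x3000, place=0x5001, pc_relative=False)
print(load.hex())  # a110000000
```

The patched bytes, together with a relocation entry recording the offset, the symbol and the type, are what a linker needs to place the code at a new address.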
I've been tinkering with it for the past 16 months or so and it's the third, hopefully industrial-grade prototype. Right now it can delink 32-bit MIPS and i386 programs from the 1990s or so to ELF object files, as long as they only involve basic relocation types.
It's half-baked because while it works, it doesn't support modern instruction sets, advanced relocation types for TLS/PLT/GOT or exporting to other object file formats besides ELF, so it's not that useful on modern artifacts (which is what I assume most reverse-engineers would care about). It's not really ready for prime time because I'm not done writing blog posts that walk through real-world applications and case studies; there's very little literature out there on this esoteric topic and it can be very confusing. Like _"let's take this PlayStation PS-EXE file that was built with a COFF toolchain back in the 90s and make MIPS ELF object files out of it that work with modern Linux toolchains"_ kind of confusing.
I started this project because I wanted to decompile a PlayStation video game and quickly realized that I'd never get anywhere without a means to divide and conquer it into more manageable pieces. Ironically the decompilation project itself hasn't advanced much, but I'm having fun so far working on this.
-
Ask HN: Tell us about your project that's not done yet but you want feedback on
I've been working on a specific reverse-engineering technique called _unlinking_ [1] on-and-off for the past 16 months or so. I'm on my third prototype (first a set of Ghidra scripts written in Jython [2], then a fork of Ghidra [3] and now a Ghidra extension [4]) and I've started a blog in order to document it [5], which side-tracked into writing a whole series of articles on reverse-engineering to introduce the topic.
What for, you may ask? Basically I'm trying to decompile a PlayStation 1 video game and I've quickly decided that dealing alone with multiple 500+ KiB executables of complete and utter spaghetti code wasn't going to work. Instead, I've decided that I'd rather divide-and-conquer the problem, so I've been tooling up to split executables into relocatable object files, in order to decompile those one at a time and _Ship of Theseus_-style my way to success.
Ironically, all of that stuff is so not done that I don't even know what meaningful feedback there could be. My prototypes do work, but only for 32-bit little-endian statically-linked MIPS executables. The articles on my blog are draft-quality. As for the decompilation project itself that started all of this, it hasn't seen much progress due to all of those side-quests. The overall topic is so esoteric that so far I've only managed to hear about one group of two people that tried to do anything remotely similar and one other anecdotal account [6] that this particular skill is very uncommon among reverse engineers.
Personally, I'm starting to think that maybe I could've actually reverse-engineered and decompiled the game in the time I took to get here. I've also tried to engage with Ghidra to upstream the foundations of my modifications in my fork, but after some back-and-forth it became clear that my prototype-grade stuff wasn't industrial-grade and couldn't be merged in its current state, which is why I'm currently reworking the code in my fork as a Ghidra extension.
To those that want to provide feedback after reading all of this: beware, I've had a lot of fun going down that rabbit hole, but this is one hell of a time sink _and_ a particularly tricky mind-bender.
[1] I don't actually _know_ what's the actual name for this technique, given that there are so few resources on it out there. I do know I didn't invent it.
[2] https://github.com/boricj/ghidra-unlinker-scripts
[3] https://github.com/boricj/ghidra/tree/feature/elfrelocateble...
[4] https://github.com/boricj/ghidra-unlinker-extension
[5] https://news.ycombinator.com/item?id=36575081#36590078
[6] https://news.ycombinator.com/item?id=35729232&p=3#35740761
KeenWrite
-
Ask HN: Tell us about your project that's not done yet but you want feedback on
KeenWrite is my free, open-source, cross-platform desktop Markdown editor that can produce beautifully typeset PDFs. I started working on it years ago to help write a novel that has a complex timeline and I couldn't find a text editor that would allow me to integrate a character sheet with the story itself.
https://github.com/DaveJarvis/keenwrite
Tutorials:
* https://www.youtube.com/playlist?list=PLB-WIt1cZYLm1MMx2FBG9...
Here's what I mean by using variables directly:
* https://www.youtube.com/watch?v=CFCqe3A5dFg
CommonMark doesn't propose a standard for bibliographic references. Would anyone find the editor more appealing if it had cross-references and citations?
-
Documentation as Code for Cloud Using PlantUML
My cross-platform desktop text editor, KeenWrite, allows users to define variables in an external YAML file. The editor calls out to Kroki[1] to convert text-based diagrams to SVG. The diagrams can reference variables and are rendered using EchoSVG[2].
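The idea of diagrams referencing externally defined variables can be sketched as a small text substitution step. The `{{key.path}}` placeholder syntax, the function names and the sample data below are assumptions made for illustration, not KeenWrite's documented behaviour; the nested dictionary stands in for a parsed YAML file.

```python
import re
from typing import Any, Dict


def flatten(tree: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Turn nested mappings into dotted keys: {'a': {'b': 1}} -> {'a.b': '1'}."""
    flat: Dict[str, str] = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = str(value)
    return flat


def interpolate(text: str, variables: Dict[str, Any]) -> str:
    """Replace {{dotted.key}} placeholders; unknown keys are left untouched."""
    flat = flatten(variables)
    return re.sub(r"\{\{([\w.]+)\}\}",
                  lambda m: flat.get(m.group(1), m.group(0)), text)


variables = {"cloud": {"db": {"name": "orders-db"}, "region": "eu-west-1"}}
diagram = "database {{cloud.db.name}} in {{cloud.region}} ({{cloud.zone}})"
print(interpolate(diagram, variables))
# database orders-db in eu-west-1 ({{cloud.zone}})
```

Leaving unknown placeholders in place keeps typos visible in the rendered output instead of silently producing empty text.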
KeenWrite[3] can produce PDF documentation from Markdown documents that have PlantUML diagrams with elements stored in an external, machine-readable file. Here are screenshots showing variables on the left, diagram text in the middle, and a real-time render on the right:
* https://raw.githubusercontent.com/DaveJarvis/KeenWrite/main/...
* https://raw.githubusercontent.com/DaveJarvis/KeenWrite/main/...
KeenWrite supports all diagrams offered by Kroki, which includes "diagram-plantuml".
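For readers curious about what calling out to Kroki involves, here is a minimal sketch of the GET URL encoding described in Kroki's documentation: the diagram source is deflate-compressed, then base64url-encoded into the path. The helper names are hypothetical, and how KeenWrite itself issues the request is not stated here. The snippet only builds the URL and makes no network call.

```python
import base64
import zlib


def kroki_url(diagram_type: str, source: str, output_format: str = "svg",
              server: str = "https://kroki.io") -> str:
    """Build a Kroki GET URL: <server>/<type>/<format>/<encoded source>."""
    compressed = zlib.compress(source.encode("utf-8"), 9)
    encoded = base64.urlsafe_b64encode(compressed).decode("ascii")
    return f"{server}/{diagram_type}/{output_format}/{encoded}"


def decode_payload(encoded: str) -> str:
    """Inverse of the encoding step, handy for checking a generated URL."""
    return zlib.decompress(base64.urlsafe_b64decode(encoded)).decode("utf-8")


source = "@startuml\nAlice -> Bob: hello\n@enduml"
url = kroki_url("plantuml", source)
print(url)
```

Fetching the resulting URL from a Kroki server should return the rendered SVG.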
[1]: https://kroki.io/
[2]: https://github.com/css4j/echosvg/
[3]: https://github.com/DaveJarvis/keenwrite
- On why Markdown is not a good, or even a half-decent, markup language
- MdBook – Create book from Markdown files. Like Gitbook but implemented in Rust
- KeenWrite 3.3.2: MermaidJS diagrams (with caveat)
-
Interactive CommonMark Tutorial
Although not interactive, I've created a video series that shows advanced usage of Markdown. Namely R, external variables, diagrams, math, annotations, and a different approach to metadata:
* https://www.youtube.com/playlist?list=PLB-WIt1cZYLm1MMx2FBG9...
Tutorial 4 shows basic Markdown:
* https://www.youtube.com/watch?v=qNbGSiRzx-0
The top-right of each video shows keyboard and mouse clicks to help follow along.[1] My desktop text editor, KeenWrite[2], is used in the tutorials.
[1]: https://github.com/DaveJarvis/kmcaster
[2]: https://github.com/DaveJarvis/keenwrite
-
“Exit Traps” Can Make Your Bash Scripts Way More Robust and Reliable
https://github.com/DaveJarvis/keenwrite/blob/main/scripts/bu...
My template script provides a way to make user-friendly shell scripts. In a script that uses the template, you define the dependencies and their sources:
DEPENDENCIES=(
-
EchoSVG: SVG rasterizer library supporting level 4 selectors (Apache 2)
I didn't create the fork, nor am I affiliated with the project. I use it in my text editor, KeenWrite, to rasterize SVG.
-
Millions of dollars in time wasted making papers fit journal guidelines
KeenWrite Themes[1] are instructions that tell ConTeXt how to typeset XHTML documents (content) into PDF files (presentation). I made a tutorial[2] that shows how my FOSS desktop text editor, KeenWrite[3], allows users to write in Markdown to typeset a document against a particular theme.
Before it can be used for scientific papers, it needs cross-references, which, unfortunately, aren't part of the CommonMark specification.
I posit that the vast majority of LaTeX users don't grok how to separate content from presentation. When I asked a question on TeX.SE about how to adjust the line spacing between enumerated items (spanning a couple dozen enumerated lists), the vast majority of people voted for the answer of using `\itemsep0em` to tweak each list ... individually.[4] The correct answer, IMO, is to fix the problem globally, and not waste time tweaking individual lists.
[1]: https://github.com/DaveJarvis/keenwrite-themes
[2]: https://www.youtube.com/watch?v=3QpX70O5S30
[3]: https://github.com/DaveJarvis/keenwrite
[4]: https://tex.stackexchange.com/questions/6081/reduce-space-be...
What are some alternatives?
ansible-easy-vpn - An Ansible playbook that sets up a Wireguard server with ad blocking, DNS-over-HTTPS, and a WebUI with 2FA
markdown-preview.nvim - markdown preview plugin for (neo)vim
pls - `pls` is a prettier and powerful `ls(1)` for the pros.
marktext - 📝A simple and elegant markdown editor, available for Linux, macOS and Windows.
rosboard - ROS node that turns your robot into a web server to visualize ROS topics
typst - A new markup-based typesetting system that is powerful and easy to learn.
divedb - This is the source repository for the DiveDB site
vim-markdown - Markdown Vim Mode
nun-db - A realtime database written in rust
Zettlr - Your One-Stop Publication Workbench
Filestash - 🦄 A modern web client for SFTP, S3, FTP, WebDAV, Git, Minio, LDAP, CalDAV, CardDAV, Mysql, Backblaze, ...
kroki - Creates diagrams from textual descriptions!