Weird things I learned while writing an x86 emulator

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. Emulator

    An i386 emulator for live-booting (by FransFaase)

    Interesting read. I have a lot of respect for people who develop emulators for x86 processors. It is a complicated processor, and from first-hand experience I know that developing and debugging CPU emulators can be very challenging. In the past year, I spent some time developing a very limited i386 emulator [1], including some system calls, for executing the first steps of live-bootstrap [2], primarily to figure out how it works. I learned a lot about system calls and ELF.

    [1] https://github.com/FransFaase/Emulator/

    [2] https://github.com/fosslinux/live-bootstrap/

  3. QEMU

    Official QEMU mirror. Please see https://www.qemu.org/contribute/ for how to submit changes to QEMU. Pull Requests are ignored. Please only use release tarballs from the QEMU website.

    Over the last year I have been rewriting QEMU's x86 decoder. I am now at a point where it should not be too hard to add APX support.

    My decoder is mostly based on the tables in the manual, and the code is mostly okay: not too much indentation, and the phases are mostly easy to separate and identify. Nevertheless, there are several cases in which the manual is wrong or doesn't tell the whole story.

    The top comment explains a bit what's going on: https://github.com/qemu/qemu/blob/59084feb256c617063e0dbe7e6...

  4. fadec

    A fast and lightweight decoder for x86 and x86-64 and encoder for x86-64.

    Thanks for the pointer to QEMU's decoder! I actually never looked at it before.

    So you coded all the tables manually in C -- interesting, that's quite some effort. I opted to autogenerate the tables (and keep them as data only => smaller memory footprint) [1,2]. That's doable, because x86 encodings are mostly fairly consistent. I can also generate an encoder from it (ok, you don't need that). Re 'custom size "xh"': AVX-512 also has fourth and eighth. Also interesting that you have a separate row for "66+F2". I special case these two (CRC32, MOVBE) instructions with a flag.

    I think the prefix decoding is not quite right for x86-64: 26/2e/36/3e are ignored in 64-bit mode, except for 2e/3e as branch-not-taken/taken hints and 3e as notrack. (See SDM Vol. 1 3.3.7.1 "Other segment override prefixes (CS, DS, ES, and SS) are ignored.") Also, REX prefixes that don't immediately precede the opcode (or VEX/EVEX prefix) are ignored. Anyhow, I need to take a closer look at the decoder with more time. :-)
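
    A minimal sketch of the two rules above, written in Python for brevity rather than in the C that QEMU and fadec use. The function name `scan_prefixes_64` and the returned dictionary are hypothetical and not taken from either decoder. It assumes 64-bit mode, does not model the 2e/3e branch hints or the 3e notrack meaning, and assumes that an ignored segment override leaves an earlier FS/GS override in place, which would need checking against real hardware:

```python
# Hypothetical sketch of x86-64 prefix scanning; not QEMU's or fadec's code.
SEGMENT_PREFIXES = {0x26: "es", 0x2E: "cs", 0x36: "ss",
                    0x3E: "ds", 0x64: "fs", 0x65: "gs"}
OTHER_LEGACY_PREFIXES = {0x66, 0x67, 0xF0, 0xF2, 0xF3}


def scan_prefixes_64(code: bytes) -> dict:
    """Scan legacy and REX prefixes of one instruction in 64-bit mode."""
    segment = None   # effective segment override (only fs/gs count here)
    rex = None       # REX byte, kept only if nothing else follows it
    legacy = []      # other legacy prefixes in the order seen
    pos = 0
    while pos < len(code):
        byte = code[pos]
        if byte in SEGMENT_PREFIXES:
            name = SEGMENT_PREFIXES[byte]
            if name in ("fs", "gs"):
                segment = name
            # es/cs/ss/ds overrides are ignored (assumption: they do not
            # cancel an earlier fs/gs override).
            rex = None  # a REX seen before this prefix is dropped
        elif byte in OTHER_LEGACY_PREFIXES:
            legacy.append(byte)
            rex = None  # same rule: REX must come last
        elif 0x40 <= byte <= 0x4F:
            rex = byte
        else:
            break       # first non-prefix byte: opcode (or VEX/EVEX) starts
        pos += 1
    return {"segment": segment, "rex": rex,
            "legacy": legacy, "opcode_offset": pos}
```

    For example, `scan_prefixes_64(bytes([0x48, 0x66, 0x8B, 0x00]))` reports no REX prefix, because the 0x48 byte is followed by the 0x66 prefix instead of the opcode.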

    > For EVEX my plan is to keep the raw bits until after the opcode has been read

    I came to the same conclusion that this is necessary with APX. The map+prefix+opcode combination identifies how the other fields are to be interpreted. For AVX-512, storing the last byte was sufficient, but with APX, vvvv got a second meaning.

    > Nevertheless, there are several cases in which the manual is wrong or doesn't tell the whole story.

    Yes... especially for corner cases, testing on real hardware is the only reliable way to find out how the CPU behaves.

    [1]: https://github.com/aengelke/fadec/blob/master/instrs.txt

  5. disarm

    Disarm — Fast AArch64 Decode/Encoder

    > Other architectures, like [...] ARMv8, are much more consistent.

    From an instruction/operation perspective, AArch64 is cleaner. However, from an instruction operand and encoding perspective, AArch64 is a lot less consistent than x86. Consider the different operand types: on x86, there are a dozen register types, immediate (8/16/32/64 bits), and memory operands (always the same layout). On AArch64, there's: GP regs, incremented GP reg (MOPS extension), extended GP reg (e.g., SXTB), shifted GP reg, stack pointer, FP reg, vector register, vector register element, vector register table, vector register table element, a dozen types of memory operands, conditions, and a dozen types of immediate encodings (including the fascinating and very useful, but also very non-trivial encoding of logical immediates [1]).

    AArch64 also has some register constraints: some vector operations can only encode registers 0-15 or 0-7; not to mention SVE with its "movprfx" prefix instruction that is only valid in front of a few selected instructions.

    [1]: https://github.com/aengelke/disarm/blob/master/encode.c#L19-...
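
    To give a sense of why logical immediates are non-trivial, here is a rough Python sketch of the decoding direction, following the DecodeBitMasks pseudocode in the Arm Architecture Reference Manual as I understand it. The function name `decode_logical_imm` is hypothetical and not taken from disarm; the encoding direction covered by the link above is the harder one.

```python
# Hypothetical sketch: decode an AArch64 logical immediate from N:immr:imms.
from typing import Optional


def decode_logical_imm(n: int, immr: int, imms: int,
                       regsize: int = 64) -> Optional[int]:
    """Return the bitmask value, or None for a reserved encoding."""
    if regsize == 32 and n != 0:
        return None                      # N=1 needs a 64-bit element
    # The highest set bit of N:NOT(imms) selects the element size.
    combined = (n << 6) | (~imms & 0x3F)
    length = combined.bit_length() - 1
    if length < 1:
        return None                      # no valid element size
    esize = 1 << length                  # 2, 4, 8, 16, 32 or 64 bits
    levels = esize - 1
    s = imms & levels                    # run of (s + 1) one bits
    r = immr & levels                    # rotate-right amount
    if s == levels:
        return None                      # an all-ones element is reserved
    welem = (1 << (s + 1)) - 1
    emask = (1 << esize) - 1
    # Rotate right by r within the element.
    elem = ((welem >> r) | (welem << (esize - r))) & emask
    # Replicate the element across the whole register.
    result = 0
    for shift in range(0, regsize, esize):
        result |= elem << shift
    return result
```

    With this sketch, N=0, immr=0, imms=0b111100 selects a 2-bit element containing a single one bit, which replicates to 0x5555555555555555.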

  6. hugo-theme-m10c

    A minimalistic (m10c) blog theme for Hugo

    Glad you like it. I used m10c, with a few tweaks: https://github.com/vaga/hugo-theme-m10c

  7. dmd

    dmd D Programming Language compiler

    > you've written an ARM disassembler

    Here's my AArch64 disassembler work in progress:

    https://github.com/dlang/dmd/blob/master/compiler/src/dmd/ba...

    I add to it in tandem with writing the code generator. Doing this helps flush out bugs in both: generate the instruction, then disassemble it and compare the result with what I thought it should be.

    It's quite a bit more complicated than the corresponding x86 disassembler:

    https://github.com/dlang/dmd/blob/master/compiler/src/dmd/ba...
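
    The generate-then-disassemble check described above can be sketched as a round trip for a single instruction form. This is a toy in Python, not dmd's D code; `encode_add_imm`, `disasm_add_imm`, and `round_trip_mismatches` are hypothetical helpers that cover only the 64-bit `ADD Xd, Xn, #imm12` form with registers 0-30 and no shift:

```python
# Hypothetical round-trip check between a toy encoder and disassembler.
from typing import List, Optional, Tuple

ADD_IMM_64 = 0x91000000   # sf=1, op=0, S=0, opcode 100010, sh=0
ADD_IMM_MASK = 0xFFC00000  # fixed bits 31..22 of this form


def encode_add_imm(rd: int, rn: int, imm12: int) -> int:
    """Encode ADD Xd, Xn, #imm12 (register 31 means SP and is rejected)."""
    if not (0 <= rd <= 30 and 0 <= rn <= 30 and 0 <= imm12 <= 0xFFF):
        raise ValueError("operand out of range for this sketch")
    return ADD_IMM_64 | (imm12 << 10) | (rn << 5) | rd


def disasm_add_imm(word: int) -> Optional[str]:
    """Disassemble the same form; return None for anything else."""
    if (word & ADD_IMM_MASK) != ADD_IMM_64:
        return None
    rd = word & 0x1F
    rn = (word >> 5) & 0x1F
    imm12 = (word >> 10) & 0xFFF
    if rd == 31 or rn == 31:
        return None  # SP forms and the MOV alias are not handled here
    return f"add x{rd}, x{rn}, #{imm12}"


def round_trip_mismatches(cases: List[Tuple[int, int, int]]) -> List[str]:
    """Encode each case, disassemble it, and report any disagreement."""
    problems = []
    for rd, rn, imm12 in cases:
        expected = f"add x{rd}, x{rn}, #{imm12}"
        actual = disasm_add_imm(encode_add_imm(rd, rn, imm12))
        if actual != expected:
            problems.append(f"{expected!r} came back as {actual!r}")
    return problems
```

    An empty list from `round_trip_mismatches` only shows that the two sides agree with each other; a shared misreading of the manual would still pass, which is why comparing against a reference disassembler or real hardware remains useful.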

  8. LuaJIT

    Mirror of the LuaJIT git repository

    For an implementation of logical immediate encoding without the loop, see https://github.com/LuaJIT/LuaJIT/blob/04dca7911ea255f37be799...

  10. shoulder

    It's probably no longer maintained, but a former colleague of mine did some work on this for C++: https://github.com/ainfosec/shoulder. Obviously if the docs are lying it doesn't help much, but there was another effort he had https://github.com/ainfosec/scapula that tried to automate detecting behavior differences between the docs and the hardware implementation.

  11. scapula

    Compare ARM CPUs Against ARM's Machine Parsable Architecture Reference Manual

