Introduction to registers
Past lessons in this series:
- Reversing 101 - introduction
- Preparing to Reverse
Lesson Objective
In this lesson we will introduce some basic definitions, explore the register organisation, and prepare for our first real program.
Remember our paradigm:
- first, you learn to build something;
- then, you learn to reverse what you built;
- finally, you learn to reverse what others build.
This lesson is more theoretical than the ones that follow. We need a precise mental model before touching real code — otherwise reversing becomes noise instead of signal.
Setup
If you haven’t done it yet, read the post Mac Malware Reversing Lab and create the virtual machine described there.
You will not need any additional tools for this lesson.
Key Concepts
Limitations
I’m assuming you already have a basic understanding of computer architecture. I won’t cover it here. You’re expected to know what a CPU is and what main memory does.
You should also look up what a data bus is — not critical right now, but useful to keep in mind.
We’re also not going to spend time on numerical representations (binary, decimal, hexadecimal). It’s an important topic, but this series would turn into an encyclopaedia if we tried to cover everything.
Definitions and concepts
We will be working at a level where the semantics of data fade away and size is all that matters. You should learn these terms now:
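- 1 byte (8 bits) is a byte, or “char”
- 2 bytes are a half-word, or “short”
- 4 bytes are a word, or “int”
- 8 bytes are a double-word, or “long”
The word-based names come from the ARM documentation, while char, short, int, and long are the matching C types on 64-bit macOS. You will see both sets everywhere once you start reversing.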
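A quick way to keep these sizes straight is to convert them to bits and to hex digits, since debuggers and disassemblers usually print values in hexadecimal. The following is only a small shell sketch of that arithmetic; the function names bits and hex_digits are made up for this illustration.

```shell
# Hypothetical helpers, not part of any tool: plain size arithmetic.
bits()       { echo $(( $1 * 8 )); }   # 1 byte = 8 bits
hex_digits() { echo $(( $1 * 2 )); }   # 1 byte = 2 hex digits

# Walk through byte, half-word, word, and double-word.
for size in 1 2 4 8; do
    printf '%d byte(s) = %2d bits = %2d hex digits\n' \
        "$size" "$(bits "$size")" "$(hex_digits "$size")"
done
```

So a double-word prints as up to 16 hex digits and a word as up to 8, which is often the fastest way to guess the size of a value on screen.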
You will also see the acronym ISA. It stands for Instruction Set Architecture, meaning the set of instructions the CPU actually understands.
We will work with the ARM ISA, specifically the Apple-Silicon flavor of AArch64.
ARM is a RISC architecture (Reduced Instruction Set Computer): simple instructions, fixed size, predictable execution. That makes it easier to study and, more importantly, easier to reverse.
Many ARM CPUs can run both 32-bit (AArch32) and 64-bit (AArch64) code. Apple Silicon executes 64-bit code only, so we will stay on 64-bit.
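The fixed instruction size has a practical consequence worth seeing in numbers: every AArch64 instruction is 4 bytes long, so the address of any instruction in a straight run of code is simple arithmetic. Here is a minimal shell sketch of that; the base address is an arbitrary example value and insn_addr is a name invented for this illustration.

```shell
# insn_addr BASE INDEX -> address of the INDEX-th instruction (0-based),
# assuming 4-byte AArch64 instructions laid out back to back.
insn_addr() { printf '0x%x\n' $(( $1 + 4 * $2 )); }

base=0x100003f98            # arbitrary example address, not from a real binary
insn_addr "$base" 0         # first instruction
insn_addr "$base" 1         # second instruction, 4 bytes later
```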
Registers
Most of the actual data processing does not happen in memory. It happens inside the CPU, in small dedicated areas called registers. You can think of them as variables, but be careful: a “variable” at this level has nothing to do with the variables you know from math or from high-level languages.
A register is just a chunk of CPU circuitry that can be accessed extremely fast. No waiting, no memory latency, no nonsense. If the CPU needs to do real work, it will do it in registers, not in RAM.
The following registers are available on a modern ARM CPU:
- X0, X1, … X30. These are the general purpose registers, and each one stores 64 bits. If you only need 32 bits, you use W0, W1, … W30 instead. A W register is simply the lower 32 bits of its X counterpart.
- X30 is the link register, also written as LR. It holds the return address when a function call happens.
- The program counter, PC, holds the address of the instruction that is being executed.
- The last register we need to mention at this point is the stack pointer, SP. It points to the top of the stack.
There are other registers, but these are more than sufficient to build our first program.
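To make the X/W relationship and the role of LR concrete, here is a shell sketch that models them with plain integer arithmetic. It does not touch real registers: the values are made up, and low32 and return_addr are names invented for this illustration. The return address rule (address of the bl plus 4) relies on AArch64 instructions being 4 bytes long.

```shell
# low32 VALUE -> the 32 bits a W register would show for a given X value.
low32() { printf '0x%08x\n' $(( $1 & 0xFFFFFFFF )); }

# return_addr BL_ADDRESS -> value placed in LR (X30) by a bl at that address:
# the address of the instruction right after the call.
return_addr() { printf '0x%x\n' $(( $1 + 4 )); }

x0=0x1122334455667788       # made-up 64-bit content for X0
echo "X0 = $x0"
echo "W0 = $(low32 "$x0")"
echo "LR after a bl at 0x100003f9c = $(return_addr 0x100003f9c)"
```

One detail this model hides: reading W0 gives you the low half of X0, but writing to W0 also clears the upper 32 bits of X0.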
The Code
.global _main
.extern _exit
_main:
mov X0, #0x1723 // exit code
bl _exit // invokes exit
What’s Happening
This is a very simple program. It comes even before the canonical “hello boredom” because, well, we usually take program exit for granted. But that’s not the case. Even exiting a program requires programming.
Let’s dissect the code.
- .global _main makes the symbol _main visible to the linker.
- .extern _exit tells the assembler/linker that we will use the _exit symbol provided by libSystem (the macOS libc).
- _main: defines the actual entry point of the program.
- mov X0, #0x1723 stores the value 0x1723 (5923 in decimal) into register X0. See the paragraph below about this form of mov.
- bl _exit calls the _exit wrapper. See the paragraph below about this form of bl.
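If you want to double-check the hexadecimal value used in the mov line, the shell's printf can do the conversion in both directions. Nothing here is specific to ARM, and the helper names to_dec and to_hex are made up for this sketch.

```shell
# printf accepts C-style constants, so 0x1723 is read as hexadecimal.
to_dec() { printf '%d\n' "$1"; }
to_hex() { printf '0x%x\n' "$1"; }

to_dec 0x1723    # hex -> decimal
to_hex 5923      # decimal -> hex
```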
Compiling and running
I am going to proceed “the hard way” to show you the phases of the build. First, the assembly source must be assembled. Assuming you have saved the source as main.s:
as -arch arm64 -o main.o main.s
This creates the object file, main.o. Now we need to link it against the libSystem library, because that is where _exit is defined:
ld -arch arm64 \
-syslibroot $(xcrun --sdk macosx --show-sdk-path) \
-o ASM_o_DETH main.o \
-lSystem
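Once the link step succeeds you can run the binary and ask the shell for its exit status. One thing to expect: the shell's $? reports only the low 8 bits of the value passed to exit, so 0x1723 shows up as 0x23, that is 35. The sketch below assumes the binary sits in the current directory and skips that step otherwise; status_seen is a helper name made up here.

```shell
# status_seen CODE -> what the shell's $? reports for a given exit code
# (only the low 8 bits survive).
status_seen() { echo $(( $1 & 0xFF )); }

status_seen 0x1723                      # expected status for our program

# The same truncation, shown without any ARM code at all.
sh -c 'exit 5923' || echo "sh exited with status $?"

# Run the real binary only if it has been built in this directory.
if [ -x ./ASM_o_DETH ]; then
    status=0
    ./ASM_o_DETH || status=$?
    echo "ASM_o_DETH exited with status $status"
fi
```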
The link step works without extra flags because _main is the default entry point for macOS binaries. If you are following these lessons on Linux, replace it with _start. No big deal, anyway. In fact, the name is quite arbitrary: you could just as well assemble the following code:
.global _santaclaus
.extern _exit
_santaclaus:
mov X0, #0x1723 // exit code
bl _exit // invokes exit
provided that you supply the linker with the entry point:
as -arch arm64 -o santa.o santa.s
SDK=$(xcrun --sdk macosx --show-sdk-path)
ld -arch arm64 \
-syslibroot $SDK \
-e _santaclaus \
-o santa \
santa.o \
-lSystem
Storing the SDK path in a variable upfront, as done here, is more practical and easier to read.
Demo
Wrapping up
I know this was a dense lesson — intentionally dense. Take your time, experiment, and let the concepts settle.
Exceptionally, I’m splitting this introduction into two parts. The second half will be out in two days. Make sure you’re fully comfortable with everything up to this point before moving on.
Next Lesson
Next time we’ll look at mov and bl properly, walk through what X0 to X8 actually do inside the ABI, and finish with your first real experiment: the same ARM64 program on macOS and on Linux, with very different outcomes.
See you in two days. Have fun.
Want the deep dive?
If you’re a security researcher, incident responder, or part of a defensive team and you need the full technical details (labs, YARA sketches, telemetry tricks), email me at info@bytearchitect.io or DM me on X (@reveng3_org). I review legit requests personally and will share private analysis and artefacts to verified contacts only.
Prefer privacy-first contact? Tell me in the first message and I’ll share a PGP key.
Gabriel(e) Biondo
ByteArchitect · RevEng3 · Rusted Pieces · Sabbath Stones