Low Level Etude One – Hello Worlds (Part 2)
Hello World - puts or printf?
bl - branch with link
Let's get back on track and learn about bl. Consider the following simple program:
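.globl _start
.p2align 2

say_hello:
    mov  x0, #1                  ; file descriptor 1 = stdout
    adrp x1, msg@PAGE
    add  x1, x1, msg@PAGEOFF     ; x1 = address of msg
    adrp x2, msg_sz@PAGE
    add  x2, x2, msg_sz@PAGEOFF
    ldr  w2, [x2]                ; msg_sz is a 32-bit .word, so load it via w2
    mov  x16, #4                 ; syscall 4 = write
    svc  0x80

_start:
    b    say_hello
    mov  x16, #1                 ; syscall 1 = exit
    svc  0x80

.data
msg:    .asciz "Hello World!"
msg_sz: .word .-msg-1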
If you build and start this it will print Hello World! forever. The b (branch) instruction jumps to say_hello. Nothing at the end of say_hello returns or exits, so execution simply falls through to the next instruction, which is that same branch at _start, and the program repeats forever.
So we need to change b to bl (branch with link), and at the end of the say_hello block we add a ret instruction. Now the execution will continue right after the bl say_hello instruction. This happens because bl saves the address of the next instruction into the lr register and ret jumps to the address saved in the lr register.
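Put together, the bl/ret version of the program might look like the following sketch. It assumes the same Darwin arm64 setup as the listing above; the explicit exit status (mov x0, #0) is an addition for illustration and not part of the original listing.

```asm
.globl _start
.p2align 2

say_hello:
    mov  x0, #1                  ; file descriptor 1 = stdout
    adrp x1, msg@PAGE
    add  x1, x1, msg@PAGEOFF     ; x1 = address of msg
    adrp x2, msg_sz@PAGE
    add  x2, x2, msg_sz@PAGEOFF
    ldr  w2, [x2]                ; msg_sz is a 32-bit .word, so load it via w2
    mov  x16, #4                 ; syscall 4 = write
    svc  0x80
    ret                          ; jump to the address that bl stored in lr

_start:
    bl   say_hello               ; lr = address of the next instruction
    mov  x0, #0                  ; exit status 0 (added for this sketch)
    mov  x16, #1                 ; syscall 1 = exit
    svc  0x80

.data
msg:    .asciz "Hello World!"
msg_sz: .word .-msg-1
```

With bl in place of b, the message should be written once and the program then exits instead of looping.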
But! What if we overwrite the lr register's content with another bl instruction? Let's add the following:
.globl _start
.p2align 2

print_newline:
    mov  x0, #1
    adrp x1, newline@PAGE
    add  x1, x1, newline@PAGEOFF
    mov  x2, #1
    mov  x16, #4
    svc  0x80
    ret

say_hello:
    mov  x0, #1
    adrp x1, msg@PAGE
    add  x1, x1, msg@PAGEOFF
    adrp x2, msg_sz@PAGE
    add  x2, x2, msg_sz@PAGEOFF
    ldr  w2, [x2]                ; msg_sz is a 32-bit .word, so load it via w2
    mov  x16, #4
    svc  0x80
    bl   print_newline           ; overwrites lr!
    ret

_start:
    bl   say_hello
    mov  x16, #1
    svc  0x80

.data
msg:     .asciz "Hello World!"
msg_sz:  .word .-msg-1
.align 4
newline: .asciz "\n"
Can you already see the problem? With bl print_newline we save another address to the lr register and overwrite what was already saved there. So once print_newline returns, we land on the ret instruction at the end of say_hello. But lr still holds the address of that very ret instruction, so it jumps to itself, again and again. We're in an endless loop.
The easy fix is to just save the content of the lr register before the bl instruction and restore it before the ret:
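say_hello:
    mov  x0, #1
    adrp x1, msg@PAGE
    add  x1, x1, msg@PAGEOFF
    adrp x2, msg_sz@PAGE
    add  x2, x2, msg_sz@PAGEOFF
    ldr  w2, [x2]                ; msg_sz is a 32-bit .word, so load it via w2
    mov  x16, #4
    svc  0x80
    mov  x3, lr                  ; save our return address
    bl   print_newline
    mov  lr, x3                  ; restore it before returning
    ret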
Beware that this will only work if the code that we jump into does not fiddle with the x3 register that we used to save the content of the lr register.
Function call convention
So what is the right way to make a proper function call in arm64 assembler? Stephens Book has a nice summary:
For the calling routine:
- Save registers x0 - x18 if you use them.
- Move the first eight parameters into the registers x0 - x7. Functions with variadic parameters might be handled differently; we'll come to that.
- Push additional parameters on the stack.
- Use bl to call the function.
- Evaluate the return code in x0.
- Restore the saved registers x0 - x18, if needed.
For the called function:
- Push x19 - x30 onto the stack if used in the routine.
- Do the work.
- Put the return code in x0.
- Pop x19 - x30 if pushed in step 1.
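The steps above can be sketched as a tiny caller and callee. This is only an illustration under the same Darwin arm64 assumptions as the rest of this post: add_two is a hypothetical function, and it uses x19 as scratch space purely to show the save and restore steps.

```asm
.globl _start
.p2align 2

; Hypothetical callee: returns x0 + x1 in x0.
add_two:
    stp  x19, lr, [sp, #-16]!    ; step 1: save x19 (and lr) because we use x19
    mov  x19, x0                 ; step 2: do the work
    add  x19, x19, x1
    mov  x0, x19                 ; step 3: put the return code in x0
    ldp  x19, lr, [sp], #16      ; step 4: restore what we pushed in step 1
    ret

_start:
    mov  x0, #40                 ; first parameter
    mov  x1, #2                  ; second parameter
    bl   add_two                 ; call the function
    ; x0 now holds the return code; pass it on as the exit status
    mov  x16, #1                 ; syscall 1 = exit
    svc  0x80
```

The caller here has nothing in x0 - x18 worth keeping, so it skips its own save and restore steps. Since the result is handed to the exit syscall, running the program and then echo $? should show 42.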
So that's not quite what we have been doing. Let's double check what clang did for us, again:
.LC0:
    .string "Hello World!"
main:
    stp  x29, x30, [sp, -16]!
    mov  x29, sp
    adrp x0, .LC0
    add  x0, x0, :lo12:.LC0
    bl   puts
    ldp  x29, x30, [sp], 16
    ret
The first line stores a pair (stp: store pair) of registers on the stack after subtracting 16 from sp. sp is the stack pointer, which holds the current position of the top of the stack. Since the stack grows toward lower addresses, subtracting 16 makes room to save the contents of two 8-byte registers.
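To make the write-back addressing more tangible, here is a small sketch that performs the same save and restore twice: once with the pre-/post-indexed forms and once with the stack pointer arithmetic spelled out. The labels with_writeback and spelled_out are made up for this illustration, and the program assumes the Darwin arm64 setup used elsewhere in this post.

```asm
.globl _start
.p2align 2

; Pre-/post-indexed forms, as in the compiler output.
with_writeback:
    stp  x29, x30, [sp, #-16]!   ; sp = sp - 16, then store x29/x30 at [sp]
    ldp  x29, x30, [sp], #16     ; load x29/x30 from [sp], then sp = sp + 16
    ret

; Same effect, with the stack pointer updates written out.
spelled_out:
    sub  sp, sp, #16             ; make room for two 8-byte registers
    stp  x29, x30, [sp]
    ldp  x29, x30, [sp]
    add  sp, sp, #16             ; give the room back
    ret

_start:
    bl   with_writeback
    bl   spelled_out
    mov  x0, #0                  ; exit status 0
    mov  x16, #1                 ; syscall 1 = exit
    svc  0x80
```

Both functions leave sp, x29 and x30 exactly as they found them; the indexed forms just fold the sub and add into the store and load.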
Let's check what the two registers are. Start lldb two and enter b start and r. Now type re r (short for register read):
General Purpose Registers:
        x0 = 0x0000000000000001
        x1 = 0x000000016fdff618
        x2 = 0x000000016fdff628
        x3 = 0x000000016fdff778
        [...]
       x27 = 0x0000000000000000
       x28 = 0x0000000000000000
        fp = 0x000000016fdff5f0
        lr = 0x000000010000d08c  dyld`start + 520
        sp = 0x000000016fdff4b0
        pc = 0x0000000100003f8c  zero`start
      cpsr = 0x60001000
x29 is the frame pointer (shown as fp) and x30 is the link register (shown as lr). Let's explore this with a minimal sample program:
.globl _start
.p2align 2

_start:
    stp  fp, lr, [sp, -16]!
    mov  fp, sp
    ; work
    ldp  fp, lr, [sp], 16
    ret
Build the program and start it in the debugger breaking on start. Let's explore the stack:
(lldb) re r sp
      sp = 0x000000016fdff460
(lldb) m read 0x000000016fdff460
0x16fdff460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x16fdff470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Execute the first instruction and check again:
(lldb) re r sp
      sp = 0x000000016fdff450
(lldb) m read 0x000000016fdff450
0x16fdff450: a0 f5 df 6f 01 00 00 00 8c d0 00 00 01 00 00 00  ...o............
0x16fdff460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Let's check if that looks like it should:
(lldb) re r fp
      fp = 0x000000016fdff5a0
(lldb) re r lr
      lr = 0x000000010000d08c  dyld`start + 520
Looks pretty good. Indeed the stack pointer got decremented by 16 bytes, making room for the two 8-byte values saved in fp and lr, and then they got pushed onto the stack (in little-endian byte order!). Next we move fp, the frame pointer, to the new position of the stack pointer so that the called function can construct a stack frame that can hold its local variables if needed. After the work is done we pop back fp and lr and are safe to call ret.
So let's rewrite our last hello world example a little bit:
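say_hello:
    stp  fp, lr, [sp, -16]!      ; save frame pointer and return address
    mov  fp, sp
    mov  x0, #1
    adrp x1, msg@PAGE
    add  x1, x1, msg@PAGEOFF
    adrp x2, msg_sz@PAGE
    add  x2, x2, msg_sz@PAGEOFF
    ldr  w2, [x2]                ; msg_sz is a 32-bit .word, so load it via w2
    mov  x16, #4
    svc  0x80
    bl   print_newline
    ldp  fp, lr, [sp], 16        ; restore them before returning
    ret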
So far so good!
Now let's write a printf driven Hello World program. Since printf uses variadic parameters, only the first parameter, the format string, goes in a register (x0); we cannot use the registers x1 - x7 for the remaining ones. The call convention simply differs here: on Darwin the variadic parameters go on the stack. Let's see how this is done:
.globl _start
.p2align 2

.equ variadic_param_1, 0

say_hello:
    stp  fp, lr, [sp, #-16]!
    sub  sp, sp, #16
    mov  fp, sp
    adrp x0, format_str@PAGE
    add  x0, x0, format_str@PAGEOFF
    adrp x1, msg@PAGE
    add  x1, x1, msg@PAGEOFF
    str  x1, [fp, #variadic_param_1]
    bl   _printf
    add  sp, sp, #16
    ldp  fp, lr, [sp], #16
    ret

_start:
    stp  fp, lr, [sp, #-16]!
    mov  fp, sp
    bl   say_hello
    ldp  fp, lr, [sp], #16
    ret

format_str: .asciz "%s\n"

.data
msg: .asciz "Hello World!"
The .equ directive gives a symbolic name to a numeric constant. We reserve some space on the stack for the variadic parameters, and the first one goes in the first slot, hence the 0 offset. After storing fp and lr onto the stack we move the stack pointer and our frame pointer 16 bytes further down. This gives us room for two 64-bit values. We only need one, but sp needs to be 16-byte aligned on Darwin. After we have loaded the address of msg into the x1 register we can save it to our stack frame (which can hold two 64-bit values). Since that is where the stack pointer points, that's also where printf will look for its first variadic parameter if the format string requires it.
You can play with a second variadic parameter and make another symbolic name, .equ variadic_param_2, 8, and then store the second value with str reg, [fp, #variadic_param_2]. Or just store it to our stack frame directly using str reg, [fp, #8]. Either way, the format string needs a matching second conversion.
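As a sketch of that experiment, the following variant passes two variadic parameters. The format string "%s (part %d)\n" and the value 2 are made up for illustration; everything else follows the listing above and assumes the same Darwin arm64 setup, where each variadic parameter occupies its own 8-byte slot on the stack.

```asm
.globl _start
.p2align 2

.equ variadic_param_1, 0
.equ variadic_param_2, 8

say_hello:
    stp  fp, lr, [sp, #-16]!
    sub  sp, sp, #16                    ; room for two 8-byte slots
    mov  fp, sp
    adrp x0, format_str@PAGE
    add  x0, x0, format_str@PAGEOFF     ; named parameter: format string in x0
    adrp x1, msg@PAGE
    add  x1, x1, msg@PAGEOFF
    str  x1, [fp, #variadic_param_1]    ; first variadic parameter: the %s string
    mov  x1, #2
    str  x1, [fp, #variadic_param_2]    ; second variadic parameter: the %d number
    bl   _printf
    add  sp, sp, #16
    ldp  fp, lr, [sp], #16
    ret

_start:
    stp  fp, lr, [sp, #-16]!
    mov  fp, sp
    bl   say_hello
    ldp  fp, lr, [sp], #16
    ret

format_str: .asciz "%s (part %d)\n"

.data
msg: .asciz "Hello World!"
```

The 16 bytes reserved earlier are now fully used, so a third variadic parameter would need another 16 bytes to keep sp aligned.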