Writing In Assembler x86 and aarch64 – Lab3 SP0600

Hello,

In this post, I describe how I am deepening my understanding of computers so that I can optimize software properly. Specifically, I am writing about learning to write assembler code on the x86_64 and aarch64 platforms for a lab in my software optimization class.

To complete this lab I performed the following tasks:

  1. Build and run the three C versions of the program for x86_64.
    Take a look at the differences in the code.
  2. Use the objdump -d command to dump (print) the object code (machine code) and disassemble it into assembler for each of the binaries. Find the <main> section and take a look at the code. Notice the total amount of code.
  3. Review, build and run the x86_64 assembly language programs. Take a look at the code using objdump -d objectfile and compare it to the source code. Notice the absence of other code (compared to the C binary, which had a lot of extra code).
  4. Build and run the three C versions of the program for aarch64. Verify that you can disassemble the object code in the ELF binary using objdump -d objectfile and take a look at the code
  5. Review, build and run the aarch64 assembly language programs. Take a look at the code using objdump -d objectfile and compare it to the source code.
  6. Write a loop that counts from 0 to 9, on x86_64 and aarch64.
  7. Extend the code to loop from 00 to 30, printing each value as a 2-digit decimal number, on x86_64 and aarch64.

How I used a Makefile

Since this lab required testing, reviewing, creating and running many files, I decided to drive everything from a Makefile.

In doing this I learned that one Makefile can call Makefiles in other folders.
I did that by adding a target to the main Makefile whose recipe is “cd /route/to/makefile && make all” (make -C /route/to/makefile all is an equivalent form).

In the attached folders you can see the Makefile I created.

Task 1

The three C programs all perform the same task of printing “Hello World!”, but they do it in three different ways.

Program 1: Uses printf()
Program 2: Uses write()
Program 3: Uses syscall()

Task 2

After reviewing the objdump output, I can see that program 1 uses the least code at 8 lines, even though it calls printf(), which has the most overhead of the three functions. Program 2 uses write(), which should have less overhead, and takes 12 lines. Finally, program 3 also takes 12 lines, and because it makes the system call directly through syscall() it has very little overhead.

Task 3

Yes. Because we are now building straight from assembler source, we don’t have the extra code that comes with a C program. This cut the program down in size drastically: the whole objdump output is now only 11 lines of code.

Task 4

Here are the total line counts for the three C programs on aarch64. The results are pretty similar to x86_64.

Program 1: 10 lines
Program 2: 12 lines
Program 3: 12 lines

Task 5

Surprisingly, the results are identical to x86_64 in terms of line count. The aarch64 Hello World program used 11 lines of code, the same as on x86_64.

Something interesting I noticed about the disassembled code is that objdump shows all the numbers in hexadecimal, even though I wrote them in decimal in the source.

Task 6

Here are my loops from 0 to 9 on x86_64 and aarch64.

/* x86 */
.text
.globl    _start

start = 0                       /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10                        /* loop exits when the index hits this number (loop condition is i<max) */

_start:
    mov     $start,%r15         /* loop index */
   
loop:
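    /* build the message for this iteration and print it */
    mov     $len,%rdx           /* message length */

    mov     $48,%r14            /* ASCII '0' */
    add     %r15,%r14           /* convert the loop index to its ASCII digit */

    movb    %r14b,msg+6         /* store the digit in the message */
    mov     $msg,%rsi           /* message location */

    mov     $1,%rdi             /* file descriptor 1 is stdout */
    mov     $1,%rax             /* syscall sys_write */
    syscall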

    inc     %r15                /* increment index */
    cmp     $max,%r15           /* see if we're done */
    jne     loop                /* loop if we're not */

    mov     $0,%rdi             /* exit status */
    mov     $60,%rax            /* syscall sys_exit */
    syscall
.data 
msg: .ascii "Loop:  \n"
    len = . - msg

/* aarch64 */
.text
.globl    _start

start = 0                       /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10                        /* loop exits when the index hits this number (loop condition is i<max) */

_start:
    mov     x30,start           /* loop index */
    
loop:
    
    mov     x19,48
    mov     x26,max
    mov     x27,1
    adr     x28,msg

    add     x19,x30,x19

    strb     w19,[x28,6]
    ldr      x1,=msg
        
    mov     x0,1
    mov     x2,len
    mov     x8, 64
    svc     0

    add     x30,x27,x30             /* increment index */
    cmp     x26,x30                 /* see if we're done */
    b.ne    loop                   /* loop if we're not */

    mov     x0,0                    /* exit status (x0 still held the return value of write) */
    mov     x8,93                   /* syscall sys_exit */
    svc     0

    .data

    msg: .ascii "Loop:      \n"
    len = . - msg

Task 7

Here are my loops from 0 to 30, with leading zeros suppressed, on x86_64 and aarch64.

/* x86 */
.text
.globl    _start

start = 0                       /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 31                        /* loop exits when the index hits this number (loop condition is i<max) */

_start:
    mov     $start,%r15         /* loop index */
    
loop:
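    /* split the index into a tens digit (quotient) and a ones digit (remainder) */
    mov     $48,%r13            /* ASCII '0', becomes the tens character */
    mov     $48,%r14            /* ASCII '0', becomes the ones character */
    mov     $0,%rdx             /* div uses rdx:rax as the dividend, so clear the high half */

    mov     %r15,%rax           /* dividend = loop index */
    mov     $10,%r12            /* divisor */
    div     %r12                /* rax = quotient, rdx = remainder */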

    
    add     %rax,%r13
    add     %rdx,%r14
    
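    cmp     $48,%r13            /* is the tens character still '0'? */
    je      continue            /* yes: leave the space in place (no leading zero) */

    movb    %r13b,msg+6         /* store the tens character */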

continue:

    movb    %r14b,msg+7
    mov     $msg,%rsi           /* message location */

    mov     $1,%rdi             /* file descriptor 1 is stdout */
    mov     $1,%rax             /* syscall sys_write */
    mov     $len,%rdx           /* message length */

    syscall

    inc     %r15                /* increment index */
    cmp     $max,%r15           /* see if we're done */
    jne     loop                /* loop if we're not */

    mov     $0,%rdi             /* exit status */
    mov     $60,%rax            /* syscall sys_exit */
    syscall

    .data

    msg: .ascii "Loop:   \n"
    len = . - msg

/* aarch64 */
.text
.globl    _start

start = 0                       /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 31                        /* loop exits when the index hits this number (loop condition is i<max) */

_start:
    mov     x30,start           /* loop index */
    
loop:
    
    mov     x19,48
    mov     x20,48
    mov     x24,10  
    mov     x25,48
    mov     x26,max
    mov     x27,1
    adr     x28,msg

    
    udiv    x21,x30,x24
    msub    x22,x21,x24,x30

    add     x19,x21,x19
    add     x20,x22,x20

    cmp     x25,x19
    b.eq    continue     
    
    strb     w19,[x28,6]

continue:

    strb     w20,[x28,7]
    ldr      x1,=msg
        
    mov     x0,1
    mov     x2,len
    mov     x8, 64
    svc     0

    add     x30,x27,x30             /* increment index */
    cmp     x26,x30                 /* see if we're done */
    b.ne    loop                   /* loop if we're not */

    mov     x0,0                    /* exit status (x0 still held the return value of write) */
    mov     x8,93                   /* syscall sys_exit */
    svc     0

    .data

    msg: .ascii "Loop:      \n"
    len = . - msg

Download my files