Showing posts with label x86 Assembly. Show all posts

Saturday, August 4, 2018

Build OS from Scratch 1

I've always been curious about how a computer works, all the way from the bottom level to the top. We use computers every day, so we are familiar with the user level, i.e., the topmost level, but how many people in the world actually know what is going on at the lowest level?

I took a course during my undergrad on how a computer and an OS work, but that was too long ago, and I don't remember much. Furthermore, when I was taking that course, I lacked much of the knowledge needed to really absorb the material; I wasn't even familiar with the most basic shell commands, such as cp, mv, etc.

Now that I think about it, that course covered exactly what I want to learn now; unfortunately, I can't access the course materials any more. Thankfully, there are plenty of other resources available through a simple Google search, so I am going to dive into this very low-level material one topic at a time.

I will be starting a series of short blog posts to summarize what I learn in my own words, starting with this one. For this post, I am referring to this excellent document.

During boot, the BIOS looks for a bootable sector on each available disk or other media. The boot sector is the first sector of the media, and it is flagged as bootable by the magic number 0xAA55 in its last two bytes.
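
As a rough illustration of that check, the shell sketch below reads bytes 510 and 511 of an image and compares them with the signature. The function name has_boot_signature is made up for this post. Note that the 16-bit value 0xAA55 is stored little-endian, so the bytes appear in the file as 55 aa.

```shell
# Hypothetical helper (the name is not from any tool): succeed if the
# first 512-byte sector of the given file ends with the boot signature.
has_boot_signature() {
    # -j 510 skips to byte offset 510, -N 2 reads two bytes,
    # -A n drops the address column, -t x1 prints each byte as hex.
    sig=$(od -A n -t x1 -j 510 -N 2 "$1" 2>/dev/null | tr -d '[:space:]')
    [ "$sig" = "55aa" ]
}
```

For example, has_boot_signature disk.img && echo bootable would print "bootable" only for an image whose first sector carries the signature (disk.img is just a placeholder name).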

Let's install qemu to emulate a computer, and nasm to assemble our code.
$ sudo apt-get install qemu nasm -y

Next, create a simple boot sector that prints 'Hello' by first creating the assembly source file hello.asm
;
; A simple boot sector that prints a message to the screen using a BIOS routine.
;
mov ah, 0x0e    ; BIOS teletype output function (tty)
mov al, 'H'     ; character to print
int 0x10        ; BIOS video interrupt: print the char on screen
mov al, 'e'
int 0x10
mov al, 'l'
int 0x10
mov al, 'l'
int 0x10
mov al, 'o'
int 0x10
jmp $           ; Jump to the current address (i.e. loop forever).
;
; Padding and magic BIOS number.
;
times 510-($-$$) db 0   ; Pad the boot sector out with zeros
                        ; $ means the address at the beginning of the current line
                        ; $$ means the address at the beginning of the current section
dw 0xaa55               ; Last two bytes form the magic number,
                        ; so BIOS knows we are a boot sector.


and assemble it into a flat binary
$ nasm hello.asm -f bin -o hello.bin
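
The padding expression in the listing is plain arithmetic: $-$$ is the number of bytes emitted so far, so 510-($-$$) zeros bring the file up to 510 bytes, and the dw adds the last two. The sketch below repeats that calculation in the shell. The helper name is made up, and the 24-byte code size is my own count for the listing (twelve 2-byte instructions), so treat it as an assumption and check it against your own build.

```shell
# Hypothetical helper: number of zero bytes that `times 510-($-$$) db 0`
# emits for a given code size in bytes.
boot_padding() {
    echo $(( 510 - $1 ))
}

code_size=24                      # assumed size of the code in hello.asm
pad=$(boot_padding "$code_size")
echo "padding: $pad bytes, total: $(( code_size + pad + 2 )) bytes"
# prints "padding: 486 bytes, total: 512 bytes"
```

Whatever the code size turns out to be, wc -c < hello.bin should report 512.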

For more details on int 0x10, refer to here.

To boot this sector, simply run
$ qemu-system-x86_64 hello.bin

To view the boot sector in hex, run
$ od -t x1 -A n hello.bin

If everything works, the QEMU window shows the machine booting and printing 'Hello'!

Tuesday, August 16, 2016

Some Useful Commands in GDB

Here are some commonly useful gdb commands:

(gdb) b main
set a breakpoint at the main function, just after the function prologue

(gdb) b *main
set a breakpoint exactly at the address of main, so that execution stops as soon as $pc equals that address (before the prologue runs)

(gdb) delete 1
delete breakpoint 1

(gdb) p/d i
print the content of variable i as a signed decimal

(gdb) p/u j
print the content of variable j as an unsigned decimal

(gdb) p/x k
print the content of variable k in hex

(gdb) x/5i $pc
print next 5 instructions in assembly

(gdb) x/s buffer
print the string at the address held in variable buffer

(gdb) x/xg 0x123456789abcdef0
print 64-bit value in hex at address 0x123456789abcdef0

(gdb) x/10wx {void*}$rbp
print 10 consecutive 32-bit values in hex, starting from the address stored in the memory location that $rbp points to (the saved frame pointer); use x/10wx $rbp to start at $rbp itself

(gdb) info reg
examine all the register values

(gdb) r arg1 arg2
run the program with command-line arguments arg1 and arg2

(gdb) si
execute one machine instruction; if it is a function call, step into the subroutine

(gdb) ni
execute one machine instruction; if it is a function call, do not step into the subroutine

(gdb) step
execute one line of C/C++ code; step into function

(gdb) next
execute one line of C/C++ code; step over function

(gdb) p func(arg1, arg2)
print return value by calling function func with arguments arg1 and arg2

(gdb) finish
continue execution until the current frame (i.e., subroutine) returns

(gdb) disass main
show assembly instructions of main function

(gdb) list main.c:37
display the source code of main.c around line 37

(gdb) display/i $pc
display the next machine instruction every time the program stops

(gdb) until 37
continue execution until the specified line (here, line 37) is reached
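
Most of these commands can also be run non-interactively, which is handy for repeating the same inspection. The sketch below writes a few of them to a command file and hands it to gdb with the -batch and -x options. The function names are made up for this post, and it assumes a program that was built with -g.

```shell
# Hypothetical helper: print a short gdb command sequence built from the
# commands listed above. The quoted 'EOF' keeps the shell from expanding $pc.
gdb_commands() {
    cat <<'EOF'
b main
r
x/5i $pc
info reg
next
EOF
}

# Hypothetical helper: run those commands against a program, then exit gdb.
# Usage: gdb_batch ./a.out
gdb_batch() {
    cmdfile=$(mktemp)
    gdb_commands > "$cmdfile"
    gdb -q -batch -x "$cmdfile" "$1"
    rm -f "$cmdfile"
}
```

Nothing runs until gdb_batch is called, so the command list can be edited first to match whatever you want to inspect.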

Sunday, August 14, 2016

Transition from C to Assembly: The Basics (x86 64bit)

In this post, I will go over how a simple C program translates to 64-bit x86 assembly.

Consider the classic helloworld.c program below:


#include <stdio.h>
int main() {
  printf("hello world\n");
  return 0;
}

Let us compile it.
$ gcc -g helloworld.c

Now, let's view its assembly using gdb:
$ gdb a.out -q
(gdb) disass main
Dump of assembler code for function main:
   0x0000000100000f6b <+0>: push   rbp
   0x0000000100000f6c <+1>: mov    rbp,rsp
   0x0000000100000f6f <+4>: lea    rdi,[rip+0x2c]        # 0x100000fa2
   0x0000000100000f76 <+11>: call   0x100000f82
   0x0000000100000f7b <+16>: mov    eax,0x0
   0x0000000100000f80 <+21>: pop    rbp
   0x0000000100000f81 <+22>: ret
End of assembler dump.

Note that you may get a different result, depending on your compiler and OS. I am compiling with gcc 5.3.0 on 64-bit Mac OS X. On other Unix-like systems, the result should be very similar, although on a 32-bit machine you will see ebp, esp, etc. instead of rbp, rsp, etc. Lastly, if you want to switch between Intel and AT&T style assembly, please refer to my previous post.

Let us go over each instruction step by step.
<+0> push rbp simply pushes the value of rbp, the frame pointer (or base pointer), onto the stack. rbp holds the base address of the current function's stack frame. Because main() has just been called, rbp still holds the caller's frame base, so main() saves it on the stack and can restore it before returning. The caller, of course, is the runtime startup code that sets up the program and calls main(), not a system call.

<+1> mov rbp, rsp copies rsp into rbp. Since the previous instruction saved the old rbp, it is now safe to store the current frame pointer in rbp. Note that rsp is the stack pointer, which holds the address of the top of the stack. The value of rsp increases with each pop and decreases with each push; remember that the stack grows from high addresses to low addresses, so each push decrements the stack pointer. After this instruction, rsp and rbp hold the same value, which is the base of the main() frame.
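
To make the direction of stack growth concrete, here is a small shell sketch of the arithmetic on 64-bit x86, where push and pop each move rsp by 8 bytes. The starting address is invented for illustration, and the helper names are hypothetical.

```shell
# Hypothetical helpers modelling how rsp changes on x86-64.
push_rsp() { printf '0x%x\n' $(( $1 - 8 )); }   # push: rsp moves down 8 bytes
pop_rsp()  { printf '0x%x\n' $(( $1 + 8 )); }   # pop:  rsp moves up 8 bytes

rsp=0x7ffeefbff5a8            # made-up stack pointer on entry to main()
rsp=$(push_rsp "$rsp")        # push rbp
echo "after push rbp: $rsp"   # prints 0x7ffeefbff5a0
rsp=$(pop_rsp "$rsp")         # pop rbp
echo "after pop rbp:  $rsp"   # prints 0x7ffeefbff5a8
```

The push at <+0> and the pop at <+21> cancel out, which is why rsp points at the return address again by the time ret executes.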

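<+4> lea rdi, [rip+0x2c] loads the address 0x100000fa2 into the rdi register, which carries the first argument of a function call. lea only computes rip+0x2c without reading memory, and rip holds the address of the next instruction (0x100000f76) at that point. This address contains what we want to print, "hello world". To see this, simply run
(gdb) x/s 0x100000fa2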
0x100000fa2: "hello world"
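
The address 0x100000fa2 is not magic: rip already points at the next instruction when the lea executes, so the target is the address of <+11> plus the displacement. A quick check of that sum in the shell, using the addresses from the dump above (the helper name is made up):

```shell
# rip-relative addressing: target = address of the next instruction + displacement
rip_target() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# 0x100000f76 is the address of <+11>; 0x2c is the displacement in the lea.
rip_target 0x100000f76 0x2c    # prints 0x100000fa2
```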

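<+11> call 0x100000f82 calls the C library routine that prints the string, reached through a stub at 0x100000f82. This is an ordinary function call rather than a system call; the library issues the actual write system call on our behalf. Compilers commonly replace a printf of a constant string ending in a newline with puts, which would explain why the string above has no trailing "\n".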

<+16> mov eax, 0x0 moves 0 into the eax register, which holds the return value of main().

<+21> pop rbp will pop the content of the memory address pointed by rsp into rbp. This content will simply be the previous value of rbp, pushed onto stack during instruction <+0>. Thus, it will restore the rbp value.

<+22> ret will finally return from the function by popping the value pointed to by rsp into rip, the instruction pointer. Note that the call instruction in main()'s caller already pushed this return address onto the stack.

Friday, October 30, 2015

Setting gdb's Assembly Code as Intel or AT&T Style

When debugging with gdb, one often looks at the assembly code of the binary file. Here, I will show how to do this and how to switch the output between 'intel' and 'att' format.

I have a simple helloworld program a.out that I'd like to debug. To do this, enter
$ gdb -q a.out

To read the assembly code of the function main, I would type in
(gdb) disass main

However, the default is the 'att' format, which isn't to my taste. To switch to 'intel' format, I'd type in
(gdb) set disassembly-flavor intel

If you'd like to make this change permanent, simply create a file .gdbinit in your home directory with the command that you want gdb to run every time it initialises.
$ echo "set disassembly-flavor intel" > ~/.gdbinit

That's it! Obviously, you may replace 'intel' with 'att' if you want to switch back to 'att' format.
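
One caveat with the echo line is that > overwrites any existing ~/.gdbinit. A slightly more careful sketch appends the setting only when it is missing; the function name is made up for this example.

```shell
# Hypothetical helper: append a line to a file unless that exact line is
# already present (-x matches whole lines, -F treats the text literally,
# -q suppresses output).
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage: ensure_line ~/.gdbinit "set disassembly-flavor intel"
```

Running it twice leaves a single copy of the line, so it is safe to keep in a setup script.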