MercuryOS Pointers in C

C lets us use pointers: variables that store memory addresses instead of data. We can later use those addresses to read from or write to the memory they refer to.

Before we start, it is recommended that you read these previous articles:

Technically speaking, every pointer is just a memory address: 32 bits wide in 32-bit code, 64 bits wide on the x86-64 system used in the debugging sessions below. What changes is the data type we are going to read or write at that address, and that type determines how we declare the pointer.

The definition of a pointer to the video memory address would be:

char* video_address = (char*) 0xb8000;

If we want to write to the memory address the pointer points to, we dereference it with *:

*video_address = 'X';

Let’s see how a pointer is translated to ASM:

vi test2.c

void my_function();

int main(int argc, char *argv[])
{
 my_function();
}

void my_function() {
 int var1 = 9;
 int* pointer = &var1;
}

We define an integer variable var1, assign a value to it, and then define a pointer that points to the memory position where the value of var1 is.

We compile:

cc -g test2.c -o test2

We load it into GDB:

gdb -q test2

Let’s see our function in ASM:

gef➤  disassemble my_function
Dump of assembler code for function my_function:
=> 0x0000000000201300 <+0>: push rbp
 0x0000000000201301 <+1>: mov rbp,rsp
 0x0000000000201304 <+4>: mov DWORD PTR [rbp-0x4],0x9
 0x000000000020130b <+11>: lea rax,[rbp-0x4]
 0x000000000020130f <+15>: mov QWORD PTR [rbp-0x10],rax
 0x0000000000201313 <+19>: pop rbp
 0x0000000000201314 <+20>: ret 
End of assembler dump.

We set a breakpoint just before calling our function:

gef➤  l
1 void my_function();
2 
3 int main(int argc, char *argv[])
4 {
5 my_function();
6 }
7 
8 void my_function() {
9 int var1 = 9;
10 int* pointer = &var1;
gef➤  b 5

We run the program:

gef➤  r

We take a few more steps with the “si” command until we reach:

 0x0000000000201304 <+4>: mov DWORD PTR [rbp-0x4],0x9

With this instruction, we are assigning 9 to the memory address rbp-0x4, “creating” the variable var1 in the stack frame.
We move forward:

gef➤  si
 0x000000000020130b <+11>: lea rax,[rbp-0x4]

With this instruction, we are saving the address rbp-0x4 in the rax register without reading memory. This is the address where the value of var1 is located, that is, the result of &var1.
We move forward:

gef➤  si
 → 0x20130f <my_function+15> mov QWORD PTR [rbp-0x10], rax

With this instruction, we are “creating” the pointer variable with the value of rax. Once it has executed, we can check the value of the pointer (at rbp-0x10):

gef➤  x/10x $rbp-0x10
0x7fffffffe0e0: 0xffffe0ec 0x00007fff 0x00000001 0x00000001
0x7fffffffe0f0: 0xffffe110 0x00007fff 0x002012f4 0x00000000
0x7fffffffe100: 0xffffe180 0x00007fff

The pointer occupies the first two 32-bit words of the dump, stored low half first, so the address is: 0x7fffffffe0ec
If we check the content of that address (the dump starts 0x10 bytes earlier to show the surrounding memory):

gef➤  x/10x 0x7fffffffe0ec-0x10
0x7fffffffe0dc: 0x00000008 0xffffe0ec 0x00007fff 0x00000001
0x7fffffffe0ec: 0x00000009 0xffffe110 0x00007fff 0x002012f4
0x7fffffffe0fc: 0x00000000 0xffffe180

At 0x7fffffffe0ec we find 0x00000009, which is the value of var1: 9.

A pointer can also be initialized directly with the memory address where the compiler has stored some data. The following example creates a pointer whose value is the memory address where the compiler has decided to put the string.

vi test3.c

void my_function();

int main(int argc, char *argv[])
{
	my_function();
}

void my_function() {
	char* my_string = "Hello";
}

We compile and load it into GDB:

cc -g test3.c -o test3
gdb -q test3

The ASM code of our function is as follows:

gef➤  disassemble my_function
Dump of assembler code for function my_function:
 0x0000000000201300 <+0>: push rbp
 0x0000000000201301 <+1>: mov rbp,rsp
 0x0000000000201304 <+4>: movabs rax,0x200490
 0x000000000020130e <+14>: mov QWORD PTR [rbp-0x8],rax
 0x0000000000201312 <+18>: pop rbp
 0x0000000000201313 <+19>: ret 
End of assembler dump.

The memory address where our string is stored was fixed at build time: 0x200490. Let’s go through how the pointer is generated, step by step.

 0x0000000000201304 <+4>: movabs rax,0x200490

It loads the memory address where our string is located into the rax register.

 0x000000000020130e <+14>: mov QWORD PTR [rbp-0x8],rax

It creates the variable my_string in the stack frame (at rbp-0x8) and assigns it the value of rax.

If we run the program step by step using the “si” command and stop at:

 0x0000000000201312 <+18>: pop rbp

We can check the value of the variable:

gef➤  x/10x $rbp-0x8
0x7fffffffe0e8: 0x00200490 0x00000000 0xffffe110 0x00007fff
0x7fffffffe0f8: 0x002012f4 0x00000000 0xffffe180 0x00007fff
0x7fffffffe108: 0xffffe190 0x00000001

The pointer value is: 0x00200490

We check that memory address:

gef➤  x/10x 0x00200490-10
0x200486: 0x00000000 0x00000000 0x65480000 0x006f6c6c
0x200496: 0x1b010000 0x00343b03 0x00050000 0x0b680000
0x2004a6: 0x00500000 0x0c880000

Hello in hexadecimal is 48 65 6c 6c 6f. We can see it in the memory dump as 0x65480000 0x006f6c6c: the bytes of each 32-bit word appear in reverse order because x86-64 is little-endian.

Another, perhaps faster, way to see it: since rax has been copied to rbp-0x8, we can inspect rax directly. GEF shows the memory address it holds and the content at that address, interpreted as ASCII:

gef➤  registers $rax
$rax : 0x0000000000200490 → 0x0000006f6c6c6548 ("Hello"?)

The above output indicates that rax has a value of 0x00200490 and that at this memory address we find the value 0x6f6c6c6548, which in ASCII (read low byte first) is Hello.

In short, a pointer is a variable that does not maintain a value in itself, but rather the memory address where the final value that we are really interested in is located.