C allows us to use pointers: variables that store memory addresses instead of data directly. Later we can use these addresses to read from or write to the memory they refer to.
Before we start, it is recommended that you read these previous articles:
- Boot Sector
- Interrupts
- Memory
- Stack
- IF-ELSE
- Functions
- Memory Segmentation
- Reading Data from Disk
- Entering Protected Mode 32bits
- Compilation, Linking, Stack Management, and Variables in C
Technically speaking, every pointer holds a memory address of the same size: 32 bits in 32-bit protected mode, 64 bits in the x86-64 builds examined below. What changes is the type of data we are going to read or write at that address, so the pointer has to be declared with the matching type.
The definition of a pointer to the video memory address would be:
char* video_address = (char*) 0xb8000;
If we want to write to the memory address pointed to by the pointer:
*video_address = 'X';
Let’s see how a pointer is translated to ASM:
void my_function();

int main(int argc, char *argv[])
{
    my_function();
}

void my_function() {
    int var1 = 9;
    int* pointer = &var1;
}
We define an integer variable var1, assign a value to it, and then define a pointer that points to the memory position where the value of var1 is.
We compile:
We load it into GDB:
Let’s see our function in ASM:
gef➤ disassemble my_function
Dump of assembler code for function my_function:
=> 0x0000000000201300 <+0>: push rbp
0x0000000000201301 <+1>: mov rbp,rsp
0x0000000000201304 <+4>: mov DWORD PTR [rbp-0x4],0x9
0x000000000020130b <+11>: lea rax,[rbp-0x4]
0x000000000020130f <+15>: mov QWORD PTR [rbp-0x10],rax
0x0000000000201313 <+19>: pop rbp
0x0000000000201314 <+20>: ret
End of assembler dump.
We set a breakpoint just before calling our function:
gef➤ l
1 void my_function();
2
3 int main(int argc, char *argv[])
4 {
5 my_function();
6 }
7
8 void my_function() {
9 int var1 = 9;
10 int* pointer = &var1;
gef➤ b 5
We run the program:
gef➤ r
We take a few more steps with the “si” command until we reach:
0x0000000000201304 <+4>: mov DWORD PTR [rbp-0x4],0x9
With this instruction, we assign 9 to the memory address rbp-0x4, “creating” the variable var1 in the stack frame.
We move forward:
gef➤ si
0x000000000020130b <+11>: lea rax,[rbp-0x4]
With this instruction, we store the address rbp-0x4 in the rax register. This is the address where the value of var1 is located, that is, the result of &var1 in the pointer definition.
We move forward:
gef➤ si
→ 0x20130f <my_function+15> mov QWORD PTR [rbp-0x10], rax
With this instruction, we “create” the pointer variable at rbp-0x10 with the value of rax. After executing it, we can check the value of the pointer:
gef➤ x/10x $rbp-0x10
0x7fffffffe0e0: 0xffffe0ec 0x00007fff 0x00000001 0x00000001
0x7fffffffe0f0: 0xffffe110 0x00007fff 0x002012f4 0x00000000
0x7fffffffe100: 0xffffe180 0x00007fff
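GDB prints the memory as 32-bit words, least significant word first, so the first two words (0xffffe0ec and 0x00007fff) together form the 64-bit address 0x7fffffffe0ec.
If we check the content of that address, starting the dump 0x10 bytes earlier for context: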
gef➤ x/10x 0x7fffffffe0ec-0x10
0x7fffffffe0dc: 0x00000008 0xffffe0ec 0x00007fff 0x00000001
0x7fffffffe0ec: 0x00000009 0xffffe110 0x00007fff 0x002012f4
0x7fffffffe0fc: 0x00000000 0xffffe180
At 0x7fffffffe0ec we find 0x00000009, that is, 9, the value of var1.
A pointer can also be initialized directly with the address where the compiler has stored some data. The following example creates a pointer whose value is the memory address where the compiler has decided to put the string.
void my_function();

int main(int argc, char *argv[])
{
    my_function();
}

void my_function() {
    char* my_string = "Hello";
}
We compile and load it into GDB:
gdb -q test3
The ASM code of our function is as follows:
gef➤ disassemble my_function
Dump of assembler code for function my_function:
0x0000000000201300 <+0>: push rbp
0x0000000000201301 <+1>: mov rbp,rsp
0x0000000000201304 <+4>: movabs rax,0x200490
0x000000000020130e <+14>: mov QWORD PTR [rbp-0x8],rax
0x0000000000201312 <+18>: pop rbp
0x0000000000201313 <+19>: ret
End of assembler dump.
The memory address where our string is stored, 0x200490, was fixed at build time and appears in the code as a constant. Let’s explain step by step how the pointer is generated.
0x0000000000201304 <+4>: movabs rax,0x200490
It loads the memory address where our string is located into the rax register.
0x000000000020130e <+14>: mov QWORD PTR [rbp-0x8],rax
It creates the variable in the stack frame, at rbp-0x8, and assigns it the value of rax.
If we run the program step by step using the “si” command and stop at:
0x0000000000201312 <+18>: pop rbp
We can check the value of the variable:
gef➤ x/10x $rbp-0x8
0x7fffffffe0e8: 0x00200490 0x00000000 0xffffe110 0x00007fff
0x7fffffffe0f8: 0x002012f4 0x00000000 0xffffe180 0x00007fff
0x7fffffffe108: 0xffffe190 0x00000001
The pointer value is: 0x00200490
We check that memory address, starting the dump 10 bytes earlier:
gef➤ x/10x 0x00200490-10
0x200486: 0x00000000 0x00000000 0x65480000 0x006f6c6c
0x200496: 0x1b010000 0x00343b03 0x00050000 0x0b680000
0x2004a6: 0x00500000 0x0c880000
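Hello in hexadecimal is 48 65 6c 6c 6f. The dump starts at 0x200486 and GDB shows each 32-bit word with its lowest-addressed byte on the right, so the string begins in the upper half of the third word: 0x65480000 holds 48 65 (He) and 0x006f6c6c holds 6c 6c 6f (llo) followed by the terminating zero byte.
Another, perhaps faster, way to see it: since rax has been copied to rbp-0x8, we can inspect rax directly, and GEF also shows the value stored at the address it points to, decoded as ASCII: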
gef➤ registers $rax
$rax : 0x0000000000200490 → 0x0000006f6c6c6548 ("Hello"?)
The above output indicates that rax holds 0x00200490, and that at this memory address we find the value 0x6f6c6c6548, which, read byte by byte from the lowest address, is the ASCII string Hello.
In short, a pointer is a variable that does not hold the final value itself, but rather the memory address where the value we are really interested in is located.