In order to load the OS from our bootloader, we will have to be able to read from the disk, store this information in RAM, and finally execute the instructions that we have loaded. In this article, we will learn how to read and load this data.
Before we begin, it is recommended that you read these previous articles:
Through the BIOS interrupt 0x13, we can access the disk.
To indicate the position from which we want to start reading data, we will use the Cylinder-Head-Sector (CHS) notation.
- A hard disk has several platters, each of which has a head on each side of the platter
- The platters have tracks, which are circles of different radii
- The tracks are divided into sectors
- The tracks of the same radius of the different platters make up what is called a cylinder
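To make the CHS notation more concrete, here is a minimal Python sketch of the usual conversion from a CHS triple to a linear sector index. The geometry constants (`HEADS_PER_CYLINDER`, `SECTORS_PER_TRACK`) and the function names are hypothetical values chosen for illustration. A real drive reports its own geometry.

```python
# Hypothetical geometry, for illustration only; a real drive reports its own.
HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK = 63
SECTOR_SIZE = 512  # bytes per sector, as in this article


def chs_to_lba(cylinder: int, head: int, sector: int) -> int:
    """Convert a CHS triple to a zero-based linear sector index."""
    if not 1 <= sector <= SECTORS_PER_TRACK:
        raise ValueError("CHS sector numbers start at 1")
    if not 0 <= head < HEADS_PER_CYLINDER:
        raise ValueError("head out of range for this geometry")
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


def byte_offset(cylinder: int, head: int, sector: int) -> int:
    """Offset of the sector inside a flat disk image."""
    return chs_to_lba(cylinder, head, sector) * SECTOR_SIZE


print(chs_to_lba(0, 0, 1))   # boot sector -> 0
print(byte_offset(0, 0, 2))  # sector right after the boot sector -> 512
```

Cylinder 0, head 0, sector 2 maps to byte offset 512 of the image, which is the sector immediately after the 512-byte boot sector.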
The necessary parameters for reading are:
*ah*: 0x02
*al*: Number of sectors to read (must be greater than 0)
*es:bx*: Segment and offset of the RAM location where the data read will be stored
*ch*: Cylinder number
*cl*: Number of the sector where reading starts (sectors are numbered from 1)
*dh*: Head number
Disk to use:
*dl* = 00h: First floppy drive (drive "A:")
*dl* = 01h: Second floppy drive (drive "B:")
*dl* = 80h: First hard disk
*dl* = 81h: Second hard disk
*dl* = FFh: Last hard disk supported by the BIOS
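The es:bx pair in the list above is a real-mode segment:offset address. As a small worked example, this Python sketch computes the physical address it refers to. It assumes es is 0 when the read happens, which the code in this article does not set explicitly. The function name is made up for the illustration.

```python
SECTOR_SIZE = 512  # bytes per sector


def physical_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


# Assumption: es = 0x0000 (not set explicitly in the article's code).
first = physical_address(0x0000, 0x9000)
print(hex(first))                # 0x9000, where the first sector read lands
print(hex(first + SECTOR_SIZE))  # 0x9200, where the second sector read lands
```

Different segment:offset pairs can name the same byte: `physical_address(0x0900, 0x0000)` also gives 0x9000.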
As we can see, these parameters take up nearly all the general-purpose registers, including dh and dl. To check afterwards that the correct number of sectors has been read, we need to keep N (the number of sectors to read) somewhere. Rather than tying up another register, we will perform a little trick with the stack:
- We will assign N to dh: mov dh, 2
- When we reach the read function, we push the dx register onto the stack: push dx
- We copy N into al (mov al, dh) and set dl/dh to indicate the disk and head to use: mov dl, 0x80/mov dh, 0x00
- We perform the disk read
- We pop the last value from the stack to dx: pop dx
- We compare al with dh: cmp al, dh
This technique works for any register; we just need to remember the order in which we push them, so that we later pop each value back into the correct register (the last value pushed is the first one popped).
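The steps above can be modeled on the host with a short Python sketch. It only simulates the registers and the stack with plain integers and a list. The function name and the `bios_reports` parameter (standing in for the value the BIOS leaves in al) are invented for the illustration.

```python
def read_with_saved_count(requested: int, bios_reports: int) -> bool:
    """Simulate the push/pop trick; True means the sector count matches."""
    stack = []
    dh, dl = requested, 0x00       # mov dh, N
    stack.append((dh << 8) | dl)   # push dx
    al = dh                        # mov al, dh  (sectors to read)
    dl, dh = 0x80, 0x00            # mov dl, 0x80 / mov dh, 0x00
    al = bios_reports              # int 0x13: BIOS leaves sectors read in al
    dx = stack.pop()               # pop dx
    dh, dl = dx >> 8, dx & 0xFF    # dh holds N again
    return al == dh                # cmp al, dh


def pop_order(pushed: list) -> list:
    """Values come back in the reverse of the order they were pushed."""
    stack = list(pushed)
    return [stack.pop() for _ in pushed]


print(read_with_saved_count(2, 2))  # True
print(read_with_saved_count(2, 1))  # False
print(pop_order(["ax", "dx"]))      # ['dx', 'ax']
```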
The final idea of all this is to read the disk’s OS, load it into RAM, and start executing the instructions we just loaded. But in our example, we will simply write data to the next two sectors of the disk and read them. It is worth noting that the first sector (512 bytes) is the bootloader itself that we are writing.
If an error occurs while reading the data, the carry flag (CF) will be set to 1. We can check the error code in the ah register. Additionally, we can check that the correct number of sectors has been read by consulting the value of the al register.
In our code, we will write our bootloader in the first sector, fill the second with the value “dada,” and the third with the value “face.” Then we will read these sectors and leave their content at a memory position. Finally, we will read the first two bytes of each loaded sector from RAM to check that the read was correct. However, these are raw binary values, and our screen printing function treats every byte as a character code: passing al=0xDA to int 0x10 would print whatever glyph has code 0xDA, not the text “DA.” To print them, we will have to do a hex-to-ASCII conversion.
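Before reading the assembly, it may help to see the hex-to-ASCII conversion mentioned above as a host-side Python sketch. It mirrors the strategy of the print_hex routine in this article: take the lowest hex digit, turn it into a character, store it from right to left, and rotate. The function name is an assumption made for the illustration.

```python
def word_to_hex_string(dx: int) -> str:
    """Convert a 16-bit value to text such as '0xFACE', one nibble at a time."""
    out = list("0x0000")                 # same template as HEX_OUT
    for cx in range(4):                  # one iteration per hex digit
        al = dx & 0x000F                 # keep only the lowest hex digit
        al += 0x30                       # 0..9 -> '0'..'9'
        if al > 0x39:                    # past '9': the digit is A..F
            al += 7                      # skip the 7 characters between '9' and 'A'
        out[5 - cx] = chr(al)            # fill the string from right to left
        dx = ((dx >> 4) | (dx << 12)) & 0xFFFF  # ror dx, 4
    return "".join(out)


print(word_to_hex_string(0xDADA))  # 0xDADA
print(word_to_hex_string(0xFACE))  # 0xFACE
```

After four rotations of four bits each, dx is back to its original value, which is why the routine can leave the caller's data untouched.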
[org 0x7c00]
mov bp, 0x8000 ; set the stack safely away from us
mov sp, bp
mov bx, 0x9000 ; memory position where to save read data from disk
mov dh, 2 ; number of sectors to read; disk_load copies it to al and saves dx on the stack
; that way we can compare it with al after the read and check that we read the correct number of sectors
call disk_load
mov dx, [0x9000] ; first word of the first loaded sector (disk sector 2), 0xdada
call print_hex
call print_nl
mov dx, [0x9000 + 512] ; first word of the second loaded sector (disk sector 3), 0xface
call print_hex
jmp $
%include "./boot_sect_print.asm"
%include "./boot_sect_print_hex.asm"
%include "./boot_sect_disk.asm"
; Magic number
times 510 - ($-$$) db 0
dw 0xaa55
; fill two more sectors, each with its own distinct data
times 256 dw 0xdada ; sector 2 = 512 bytes
times 256 dw 0xface ; sector 3 = 512 bytes
; load 'dh' sectors from the first hard disk into ES:BX
disk_load:
pusha ; save all registers on the stack before running the function
push dx ; save dx on the stack, we are going to overwrite dl/dh meanwhile
mov ah, 0x02 ; BIOS function "read sectors" when int 0x13 is fired
mov dl, 0x80 ; use the first hard disk
mov al, dh ; number of sectors to read
mov dh, 0x00 ; use the first head
mov ch, 0x00 ; read from the first cylinder (track)
mov cl, 0x02 ; sector number to start reading (sector 1 is the boot sector)
int 0x13 ; BIOS interrupt
jc disk_error ; if error (stored in the carry bit)
pop dx; recover dx content from stack
cmp al, dh ; BIOS also sets 'al' to the # of sectors read. Compare it.
jne sectors_error
popa; restore registers state
ret
disk_error:
mov bx, DISK_ERROR
call print
call print_nl
mov dh, ah ; ah = error code, dl = disk drive that raised the error
call print_hex ; check out the code at http://stanislavs.org/helppc/int_13-1.html
jmp disk_loop
sectors_error:
mov bx, SECTORS_ERROR
call print
disk_loop:
jmp $
DISK_ERROR: db "Disk read error", 0
SECTORS_ERROR: db "Incorrect number of sectors read", 0
; the 16-bit value to print is passed in the dx register
print_hex:
pusha
mov cx, 0 ; loop control counter
; Strategy: get the last hex digit of 'dx', then convert it to ASCII
; Numeric ASCII values: '0' (ASCII 0x30) to '9' (0x39), so just add 0x30 to digit N.
; For alphabetic characters A-F: 'A' (ASCII 0x41) to 'F' (0x46), add a further 7 (0x0a + 0x30 + 7 = 0x41)
; Then, move the ASCII byte to the correct position on the resulting string
hex_loop:
cmp cx, 4 ; loop 4 times, once per hex digit of 'dx'
je end
; 1. convert last char of 'dx' to ascii
mov ax, dx ; we will use 'ax' as our working register
and ax, 0x000f ; 0x1234 -> 0x0004 by masking the first three digits to zero
add al, 0x30 ; add 0x30 to N to convert it to ASCII "N"
cmp al, 0x39 ; if > '9', add an extra 7 to represent 'A' to 'F'
jle step2
add al, 7 ; 'A' is ASCII 65 instead of 58, so 65-58=7
step2:
; 2. get the correct position of the string to place our ASCII char
; bx <- base address + string length - index of char
mov bx, HEX_OUT + 5 ; base + length
sub bx, cx ; our index variable
mov [bx], al ; copy the ASCII char on 'al' to the position pointed by 'bx'
ror dx, 4 ; 0x1234 -> 0x4123 -> 0x3412 -> 0x2341 -> 0x1234
; increment index and loop
add cx, 1
jmp hex_loop
end:
; prepare the parameter and call the function
; remember that print receives parameters in 'bx'
mov bx, HEX_OUT
call print
popa
ret
HEX_OUT:
db '0x0000',0 ; reserve memory for our new string
; ---- print function ----
print:
pusha
; keep this in mind:
; while (string[i] != 0) { print string[i]; i++ }
; the comparison for string end (null byte)
start:
mov al, [bx] ; 'bx' is the base address for the current char
cmp al, 0
je done
mov ah, 0x0e; tty mode
int 0x10 ; print char
; increment pointer to next byte and do next loop
add bx, 1
jmp start
done:
popa
ret
; ---- print function ----
; ---- print_nl function ----
print_nl:
pusha
mov ah, 0x0e
mov al, 0x0a ; newline char
int 0x10
mov al, 0x0d ; carriage return
int 0x10
popa
ret
; ---- print_nl function ----
We generate the image:
If we dump the contents of our hard disk image, we can clearly see the boot sector followed by our second and third sectors:
00000000: bd00 8089 ecbb 0090 b602 e866 008b 1600 ...........f....
00000010: 90e8 2b00 e81b 008b 1600 92e8 2100 ebfe ..+.........!...
00000020: 608a 073c 0074 09b4 0ecd 1083 c301 ebf1 `..<.t..........
00000030: 61c3 60b4 0eb0 0acd 10b0 0dcd 1061 c360 a.`..........a.`
00000040: b900 0083 f904 741c 89d0 83e0 0f04 303c ......t.......0<
00000050: 397e 0204 07bb 717c 29cb 8807 c1ca 0483 9~....q|).......
00000060: c101 ebdf bb6c 7ce8 b6ff 61c3 3078 3030 .....l|...a.0x00
00000070: 3030 0060 52b4 02b2 8088 f0b6 00b5 00b1 00.`R...........
00000080: 02cd 1372 075a 38f0 7512 61c3 bba4 7ce8 ...r.Z8.u.a...|.
00000090: 8eff e89d ff88 e6e8 a5ff eb06 bbb4 7ce8 ..............|.
000000a0: 7eff ebfe 4469 736b 2072 6561 6420 6572 ~...Disk read er
000000b0: 726f 7200 496e 636f 7272 6563 7420 6e75 ror.Incorrect nu
000000c0: 6d62 6572 206f 6620 7365 6374 6f72 7320 mber of sectors
000000d0: 7265 6164 0000 0000 0000 0000 0000 0000 read............
000000e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000000f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000100: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000110: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000120: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000130: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000140: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000150: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000160: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000170: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000180: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000190: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 0000 0000 0000 55aa ..............U.
00000200: dada dada dada dada dada dada dada dada ................
00000210: dada dada dada dada dada dada dada dada ................
00000220: dada dada dada dada dada dada dada dada ................
00000230: dada dada dada dada dada dada dada dada ................
00000240: dada dada dada dada dada dada dada dada ................
00000250: dada dada dada dada dada dada dada dada ................
00000260: dada dada dada dada dada dada dada dada ................
00000270: dada dada dada dada dada dada dada dada ................
00000280: dada dada dada dada dada dada dada dada ................
00000290: dada dada dada dada dada dada dada dada ................
000002a0: dada dada dada dada dada dada dada dada ................
000002b0: dada dada dada dada dada dada dada dada ................
000002c0: dada dada dada dada dada dada dada dada ................
000002d0: dada dada dada dada dada dada dada dada ................
000002e0: dada dada dada dada dada dada dada dada ................
000002f0: dada dada dada dada dada dada dada dada ................
00000300: dada dada dada dada dada dada dada dada ................
00000310: dada dada dada dada dada dada dada dada ................
00000320: dada dada dada dada dada dada dada dada ................
00000330: dada dada dada dada dada dada dada dada ................
00000340: dada dada dada dada dada dada dada dada ................
00000350: dada dada dada dada dada dada dada dada ................
00000360: dada dada dada dada dada dada dada dada ................
00000370: dada dada dada dada dada dada dada dada ................
00000380: dada dada dada dada dada dada dada dada ................
00000390: dada dada dada dada dada dada dada dada ................
000003a0: dada dada dada dada dada dada dada dada ................
000003b0: dada dada dada dada dada dada dada dada ................
000003c0: dada dada dada dada dada dada dada dada ................
000003d0: dada dada dada dada dada dada dada dada ................
000003e0: dada dada dada dada dada dada dada dada ................
000003f0: dada dada dada dada dada dada dada dada ................
00000400: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000410: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000420: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000430: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000440: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000450: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000460: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000470: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000480: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000490: cefa cefa cefa cefa cefa cefa cefa cefa ................
000004a0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000004b0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000004c0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000004d0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000004e0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000004f0: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000500: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000510: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000520: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000530: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000540: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000550: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000560: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000570: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000580: cefa cefa cefa cefa cefa cefa cefa cefa ................
00000590: cefa cefa cefa cefa cefa cefa cefa cefa ................
000005a0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000005b0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000005c0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000005d0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000005e0: cefa cefa cefa cefa cefa cefa cefa cefa ................
000005f0: cefa cefa cefa cefa cefa cefa cefa cefa ................
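The xxd command shows the ASCII character for each byte on the right, but only for printable values (0x20 to 0x7e). Bytes such as “da,” “ce,” and “fa” fall outside that range, so xxd prints a dot instead. Note also that 0xface appears as “cefa” in the dump because x86 stores 16-bit words in little-endian order, low byte first. This is precisely the same problem we had on screen, and why we had to convert these values to ASCII.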
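To double-check how the dump above comes about, this Python sketch rebuilds the two data sectors on the host and imitates the right-hand column of xxd. The helper name `xxd_ascii` is made up for the illustration. It assumes xxd prints a byte as a character only when it lies between 0x20 and 0x7e.

```python
import struct

# 'times 256 dw 0xdada' and 'times 256 dw 0xface': 256 little-endian words each.
sector2 = struct.pack("<H", 0xDADA) * 256
sector3 = struct.pack("<H", 0xFACE) * 256


def xxd_ascii(chunk: bytes) -> str:
    """Imitate xxd's right-hand column: printable ASCII or a dot."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)


print(len(sector2), len(sector3))               # 512 512
print(sector3[:4].hex())                        # cefacefa: low byte first
print(xxd_ascii(sector3[:16]))                  # ................
print(xxd_ascii(b"Disk read er"))               # Disk read er
print(hex(struct.unpack("<H", sector3[:2])[0]))  # 0xface
```

Reading the first two bytes back as a little-endian word gives 0xface again, which matches what the bootloader prints after loading the sector into RAM.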
We load the image into Qemu:
SeaBIOS (version rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org)
iPXE (http://ipxe.org) 00:03.0 C980 PCI2.10 PnP PMM+07F91410+07EF1410 C980
Booting from Hard Disk...
0xDADA
0xFACE
With this, we can now read from the disk and store what we read in a region of RAM.