
MercuryOS Reading data from disk

 ·  🎃 kr0m

In order to load the OS from our bootloader, we will have to be able to read from the disk, store this information in RAM, and finally execute the instructions that we have loaded. In this article, we will learn how to read and load this data.

Before we begin, it is recommended that you read the previous articles in this series.


Through the BIOS interrupt 0x13, we can access the disk.
To indicate the position from which we want to start reading data, we will use the Cylinder-Head-Sector (CHS) notation.

  • A hard disk has several platters, each of which has a head on each side of the platter
  • The platters have tracks, which are circles of different radii
  • The tracks are divided into sectors
  • The tracks of the same radius of the different platters make up what is called a cylinder
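
To make the CHS scheme more concrete, here is a minimal Python sketch of the usual conversion between a CHS address and a linear sector number (LBA). The geometry constants (16 heads, 63 sectors per track) and the function names are assumptions chosen for illustration; they do not come from this article. Note that cylinders and heads are counted from 0, while sectors are counted from 1.

```python
# Hypothetical disk geometry, for illustration only.
HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK = 63
SECTOR_SIZE = 512


def chs_to_lba(cylinder, head, sector):
    """Convert a CHS address to a zero-based linear sector number."""
    if not 1 <= sector <= SECTORS_PER_TRACK:
        raise ValueError("sectors are numbered from 1 to SECTORS_PER_TRACK")
    if not 0 <= head < HEADS_PER_CYLINDER:
        raise ValueError("head out of range for this geometry")
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


def lba_to_chs(lba):
    """Inverse conversion: linear sector number back to (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head, sector_index = divmod(remainder, SECTORS_PER_TRACK)
    return cylinder, head, sector_index + 1


print(chs_to_lba(0, 0, 1))                # 0: the boot sector
print(chs_to_lba(0, 0, 2) * SECTOR_SIZE)  # 512: byte offset of the sector after it
```

With cylinder 0 and head 0, sector 2 starts at byte offset 512 of the disk, immediately after the boot sector.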

The necessary parameters for reading are:

*ah*: 0x02 (the BIOS "read sectors" function)  
*al*: Number of sectors to read (must be greater than 0)  
  
*es:bx*: Segment and offset of the RAM address where the data read will be stored.  
  
*ch*: Cylinder number (low 8 bits)  
*cl*: Sector number where reading starts (sectors are numbered from 1)  
  
*dh*: Head number  

Disk to use:
*dl* = 00h: First floppy drive (drive "A:")  
*dl* = 01h: Second floppy drive (drive "B:")  
*dl* = 80h: First hard disk  
*dl* = 81h: Second hard disk  
*dl* = FFh: Last hard disk supported by the BIOS
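
As a rough illustration of how these parameters fit into the 8-bit registers, the following Python sketch builds the register values for a read. The function name and the range checks are assumptions made for this example. The one detail not obvious from the list is that cl carries the sector number in its low 6 bits and the two high bits of a 10-bit cylinder number in its top 2 bits.

```python
def int13_read_registers(cylinder, head, sector, count, drive=0x80):
    """Return the 8-bit register values for INT 0x13 with AH=0x02 (hypothetical helper)."""
    if not 0 <= cylinder <= 1023:
        raise ValueError("cylinder must fit in 10 bits")
    if not 1 <= sector <= 63:
        raise ValueError("sector must be between 1 and 63")
    if not 0 <= head <= 255:
        raise ValueError("head must fit in 8 bits")
    if not 1 <= count <= 255:
        raise ValueError("sector count must be greater than 0 and fit in 8 bits")
    return {
        "ah": 0x02,                                       # read sectors
        "al": count,                                      # number of sectors
        "ch": cylinder & 0xFF,                            # low 8 bits of the cylinder
        "cl": (sector & 0x3F) | ((cylinder >> 2) & 0xC0), # sector + cylinder bits 8-9
        "dh": head,
        "dl": drive,
    }


# The read performed in this article: 2 sectors, cylinder 0, head 0, sector 2.
regs = int13_read_registers(cylinder=0, head=0, sector=2, count=2)
print({name: hex(value) for name, value in regs.items()})
```

For small disks such as our three-sector image the cylinder is always 0, so cl is simply the sector number.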

As we can see, ax, bx, cx and dx are all taken by the call, yet to check that the correct number of sectors has been read, we need to keep N (the number of sectors to read) somewhere. Since there are no registers to spare, we will use a little trick:

  • We will assign N to dh: mov dh, 2
  • When we reach the read function, we push the dx register onto the stack: push dx
  • We copy N to al, then set dl/dh to indicate the disk and head to use: mov al, dh/mov dl, 0x80/mov dh, 0x00
  • We perform the disk read
  • We pop the last value from the stack to dx: pop dx
  • We compare al with dh: cmp al, dh

This technique works for any register; we just need to remember the order in which we pushed them so that we later pop each value back into the correct register.
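
The push/pop ordering can be modelled in a few lines of Python. This is only a sketch of the last-in, first-out behaviour; the register values are made up for the example.

```python
# Model the CPU stack as a Python list and two 16-bit registers as a dict.
stack = []
regs = {"ax": 0x1111, "dx": 0x0200}  # dh = 0x02 (sectors to read), dl = 0x00

# push ax / push dx
stack.append(regs["ax"])
stack.append(regs["dx"])

# The registers are now free to be overwritten for the BIOS call.
regs["ax"] = 0x0202  # ah = 0x02 (read), al = 2 sectors
regs["dx"] = 0x0080  # dh = head 0, dl = drive 0x80

# pop dx / pop ax: the reverse of the push order
regs["dx"] = stack.pop()
regs["ax"] = stack.pop()

print(hex(regs["dx"] >> 8))  # 0x2: the saved sector count is back in dh
```

Popping in the same order as the pushes would leave the old ax value in dx and vice versa, which is exactly the mistake to watch out for.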

The end goal is to read the OS from the disk, load it into RAM, and start executing the instructions we just loaded. In this example, however, we will simply place known data in the two sectors that follow the boot sector and read them back. Keep in mind that the first sector (512 bytes) is the bootloader we are writing.

If an error occurs while reading the data, the carry flag (CF) will be set to 1, and the error code will be available in the ah register. Additionally, we can check that the correct number of sectors has been read by inspecting the value of the al register.
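
The two checks can be summarised in a short Python sketch. The helper name is hypothetical, and the status table is a small, non-exhaustive sample of codes commonly listed in INT 0x13 references; the messages returned for the two failure cases match the strings used later in the assembly code.

```python
# A few INT 0x13 status codes as commonly documented; not an exhaustive list.
STATUS = {
    0x01: "invalid function or parameter",
    0x04: "sector not found",
    0x80: "timeout (drive not ready)",
}


def check_read(carry_flag, ah, al, expected_sectors):
    """Mimic the post-read checks: first the carry flag, then the sector count."""
    if carry_flag:
        reason = STATUS.get(ah, "unknown status")
        return f"Disk read error 0x{ah:02X}: {reason}"
    if al != expected_sectors:
        return "Incorrect number of sectors read"
    return "ok"


print(check_read(carry_flag=False, ah=0x00, al=2, expected_sectors=2))
print(check_read(carry_flag=True, ah=0x04, al=0, expected_sectors=2))
print(check_read(carry_flag=False, ah=0x00, al=1, expected_sectors=2))
```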

In our code, we will write our bootloader in the first sector, fill the second with the value 0xdada, and the third with the value 0xface. Then we will read these two sectors into a memory location. Finally, we will read the first two bytes of each loaded sector from RAM to check that the read was correct. However, these are raw binary values, and our print routine sends each byte to the screen as a character code: calling int 0x10 with al=0xDA would print whatever glyph has code 0xDA, not the text “DA”. To display them, we have to convert each hex digit to its ASCII character.
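
Before reading the assembly version, it may help to see the hex-to-ASCII conversion as a Python sketch. It follows the same steps as the print_hex routine in this article (mask the lowest nibble, add 0x30, add 7 more for A to F, rotate right by 4 bits); the function name word_to_hex_string is hypothetical.

```python
def word_to_hex_string(value):
    """Convert a 16-bit value to the text '0xNNNN', one nibble at a time."""
    out = list("0x0000")                  # template string, like HEX_OUT
    for index in range(4):                # one iteration per hex digit
        nibble = value & 0x000F           # and ax, 0x000f
        code = nibble + 0x30              # '0'..'9' for nibbles 0..9
        if code > 0x39:                   # past '9': jump ahead to 'A'..'F'
            code += 7
        out[5 - index] = chr(code)        # fill the string from right to left
        value = ((value >> 4) | (value << 12)) & 0xFFFF  # ror dx, 4
    return "".join(out)


print(word_to_hex_string(0xDADA))  # 0xDADA
print(word_to_hex_string(0x1234))  # 0x1234
```

After four rotations of 4 bits the value is back to its original state, which is why the assembly routine can leave dx untouched from the caller's point of view.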

vi boot_sect_main.asm
[org 0x7c00]
mov bp, 0x8000 ; set the stack safely away from us
mov sp, bp

mov bx, 0x9000 ; memory position where to save read data from disk
mov dh, 2 ; number of sectors to read; disk_load keeps a copy of dh
; so that it can compare it against al after the read and verify that the correct number of sectors was read
call disk_load

mov dx, [0x9000] ; first word of the first loaded sector (disk sector 2), expected 0xdada
call print_hex
call print_nl

mov dx, [0x9000 + 512] ; first word of the second loaded sector (disk sector 3), expected 0xface
call print_hex

jmp $

%include "./boot_sect_print.asm"
%include "./boot_sect_print_hex.asm"
%include "./boot_sect_disk.asm"

; Magic number
times 510 - ($-$$) db 0
dw 0xaa55

; write two sectors with distinct data each one
times 256 dw 0xdada ; sector 2 = 512 bytes
times 256 dw 0xface ; sector 3 = 512 bytes
vi boot_sect_disk.asm
; load 'dh' sectors from the first hard disk (0x80) into ES:BX, starting at sector 2
disk_load:
    pusha ; save all registers on the stack before doing any work
    push dx ; save dx as well, since dl/dh are about to be overwritten

    mov ah, 0x02 ; read from disk action when int13 is fired
    
    mov dl, 0x80 ; use first hard disk
    mov al, dh   ; number of sectors to read
    mov dh, 0x00 ; use the first head
    mov ch, 0x00 ; read from the first cylinder (track)
    mov cl, 0x02 ; sector number to start reading

    int 0x13      ; BIOS interrupt
    jc disk_error ; if error (stored in the carry bit)

    pop dx; recover dx content from stack
    cmp al, dh    ; BIOS also sets 'al' to the # of sectors read. Compare it.
    jne sectors_error
    
    popa; restore registers state
    ret


disk_error:
    mov bx, DISK_ERROR
    call print
    call print_nl
    mov dh, ah ; ah = error code, dl = disk drive that dropped the error
    call print_hex ; check out the code at http://stanislavs.org/helppc/int_13-1.html
    jmp disk_loop

sectors_error:
    mov bx, SECTORS_ERROR
    call print

disk_loop:
    jmp $

DISK_ERROR: db "Disk read error", 0
SECTORS_ERROR: db "Incorrect number of sectors read", 0
vi boot_sect_print_hex.asm
; the 16-bit value to print is passed in the dx register
print_hex:
    pusha

    mov cx, 0 ; loop control counter

; Strategy: take the lowest hex digit (nibble) of 'dx' and convert it to ASCII
; Digits 0-9: '0' (ASCII 0x30) to '9' (0x39), so just add 0x30 to the nibble.
; Digits 10-15: 'A' (ASCII 0x41) to 'F' (0x46), so add 0x30 plus an extra 7 (0x37 in total)
; Then, move the ASCII byte to the correct position in the resulting string
hex_loop:
    cmp cx, 4 ; loop 4 times, once per hex digit of the 16-bit value in dx
    je end
    
    ; 1. convert last char of 'dx' to ascii
    mov ax, dx ; we will use 'ax' as our working register
    and ax, 0x000f ; 0x1234 -> 0x0004 by masking first three to zeros
    add al, 0x30 ; add 0x30 to N to convert it to ASCII "N"
    cmp al, 0x39 ; if > '9', add an extra 7 to reach 'A' to 'F'
    jle step2
    add al, 7 ; 'A' is ASCII 65 instead of 58, so 65-58=7

step2:
    ; 2. get the correct position of the string to place our ASCII char
    ; bx <- address of the last character of the string - index of char
    mov bx, HEX_OUT + 5 ; base + 5 = last character of '0x0000'
    sub bx, cx  ; our index variable
    mov [bx], al ; copy the ASCII char on 'al' to the position pointed by 'bx'
    ror dx, 4 ; 0x1234 -> 0x4123 -> 0x3412 -> 0x2341 -> 0x1234

    ; increment index and loop
    add cx, 1
    jmp hex_loop

end:
    ; prepare the parameter and call the function
    ; remember that print receives parameters in 'bx'
    mov bx, HEX_OUT
    call print

    popa
    ret

HEX_OUT:
    db '0x0000',0 ; reserve memory for our new string
vi boot_sect_print.asm
; ---- print function ----
print:
    pusha

; keep this in mind:
; while (string[i] != 0) { print string[i]; i++ }

; the comparison for string end (null byte)
start:
    mov al, [bx] ; 'bx' is the base address for the current char
    cmp al, 0 
    je done

    mov ah, 0x0e; tty mode
    int 0x10 ; print char

    ; increment pointer to next byte and do next loop
    add bx, 1
    jmp start

done:
    popa
    ret
; ---- print function ----


; ---- print_nl function ----
print_nl:
    pusha
    
    mov ah, 0x0e
    mov al, 0x0a ; newline char
    int 0x10
    mov al, 0x0d ; carriage return
    int 0x10
    
    popa
    ret
; ---- print_nl function ----

We generate the image:

nasm -f bin boot_sect_main.asm -o boot_sect_main.bin

If we dump the contents of our hard disk image, we can clearly see the boot sector followed by our second and third sectors:

xxd boot_sect_main.bin

00000000: bd00 8089 ecbb 0090 b602 e866 008b 1600  ...........f....  
00000010: 90e8 2b00 e81b 008b 1600 92e8 2100 ebfe  ..+.........!...  
00000020: 608a 073c 0074 09b4 0ecd 1083 c301 ebf1  `..<.t..........  
00000030: 61c3 60b4 0eb0 0acd 10b0 0dcd 1061 c360  a.`..........a.`  
00000040: b900 0083 f904 741c 89d0 83e0 0f04 303c  ......t.......0<  
00000050: 397e 0204 07bb 717c 29cb 8807 c1ca 0483  9~....q|).......  
00000060: c101 ebdf bb6c 7ce8 b6ff 61c3 3078 3030  .....l|...a.0x00  
00000070: 3030 0060 52b4 02b2 8088 f0b6 00b5 00b1  00.`R...........  
00000080: 02cd 1372 075a 38f0 7512 61c3 bba4 7ce8  ...r.Z8.u.a...|.  
00000090: 8eff e89d ff88 e6e8 a5ff eb06 bbb4 7ce8  ..............|.  
000000a0: 7eff ebfe 4469 736b 2072 6561 6420 6572  ~...Disk read er  
000000b0: 726f 7200 496e 636f 7272 6563 7420 6e75  ror.Incorrect nu  
000000c0: 6d62 6572 206f 6620 7365 6374 6f72 7320  mber of sectors   
000000d0: 7265 6164 0000 0000 0000 0000 0000 0000  read............  
000000e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
000000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000100: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000110: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000120: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000130: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000140: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000150: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000160: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000170: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000180: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
00000190: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
000001a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
000001b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
000001c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................  
000001f0: 0000 0000 0000 0000 0000 0000 0000 55aa  ..............U.  
00000200: dada dada dada dada dada dada dada dada  ................  
00000210: dada dada dada dada dada dada dada dada  ................  
00000220: dada dada dada dada dada dada dada dada  ................  
00000230: dada dada dada dada dada dada dada dada  ................  
00000240: dada dada dada dada dada dada dada dada  ................  
00000250: dada dada dada dada dada dada dada dada  ................  
00000260: dada dada dada dada dada dada dada dada  ................  
00000270: dada dada dada dada dada dada dada dada  ................  
00000280: dada dada dada dada dada dada dada dada  ................  
00000290: dada dada dada dada dada dada dada dada  ................  
000002a0: dada dada dada dada dada dada dada dada  ................  
000002b0: dada dada dada dada dada dada dada dada  ................  
000002c0: dada dada dada dada dada dada dada dada  ................  
000002d0: dada dada dada dada dada dada dada dada  ................  
000002e0: dada dada dada dada dada dada dada dada  ................  
000002f0: dada dada dada dada dada dada dada dada  ................  
00000300: dada dada dada dada dada dada dada dada  ................  
00000310: dada dada dada dada dada dada dada dada  ................  
00000320: dada dada dada dada dada dada dada dada  ................  
00000330: dada dada dada dada dada dada dada dada  ................  
00000340: dada dada dada dada dada dada dada dada  ................  
00000350: dada dada dada dada dada dada dada dada  ................  
00000360: dada dada dada dada dada dada dada dada  ................  
00000370: dada dada dada dada dada dada dada dada  ................  
00000380: dada dada dada dada dada dada dada dada  ................  
00000390: dada dada dada dada dada dada dada dada  ................  
000003a0: dada dada dada dada dada dada dada dada  ................  
000003b0: dada dada dada dada dada dada dada dada  ................  
000003c0: dada dada dada dada dada dada dada dada  ................  
000003d0: dada dada dada dada dada dada dada dada  ................  
000003e0: dada dada dada dada dada dada dada dada  ................  
000003f0: dada dada dada dada dada dada dada dada  ................  
00000400: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000410: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000420: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000430: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000440: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000450: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000460: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000470: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000480: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000490: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000004a0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000004b0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000004c0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000004d0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000004e0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000004f0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000500: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000510: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000520: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000530: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000540: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000550: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000560: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000570: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000580: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
00000590: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000005a0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000005b0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000005c0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000005d0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000005e0: cefa cefa cefa cefa cefa cefa cefa cefa  ................  
000005f0: cefa cefa cefa cefa cefa cefa cefa cefa  ................

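The xxd command shows the ASCII representation of each byte on the right. Bytes such as “da”, “ce” and “fa” do not correspond to printable ASCII characters (0x20 to 0x7E), so xxd displays a dot instead. Note also that 0xface appears as “ce fa” because x86 stores 16-bit words in little-endian order. This is essentially the same problem we had, and why we had to convert these values to ASCII text before printing them.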
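
The layout and byte order seen in the dump can be reproduced with a short Python sketch. The boot sector here is a zero-filled placeholder (an assumption, since the real code bytes come from NASM); only the 0xaa55 signature and the two data sectors mirror the listing. The ascii_column helper is hypothetical and simply imitates the right-hand column of the dump.

```python
import struct

# Placeholder boot sector: 510 zero bytes followed by the 0xaa55 signature.
boot = bytes(510) + struct.pack("<H", 0xAA55)
# Two data sectors of 256 little-endian words each, as in the assembly source.
image = boot + struct.pack("<H", 0xDADA) * 256 + struct.pack("<H", 0xFACE) * 256


def ascii_column(chunk):
    """Show printable ASCII bytes as characters and everything else as a dot."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)


print(len(image))                        # 1536 = 3 sectors of 512 bytes
print(image[0x1FE:0x200].hex())          # 55aa: signature, low byte first
print(image[0x400:0x402].hex())          # cefa: 0xface in little-endian order
print(ascii_column(image[0x1FE:0x200]))  # U.
```

The last line matches the end of the boot sector row in the dump: 0x55 is the letter “U”, while 0xaa has no printable ASCII character.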

We load the image into Qemu:

qemu-system-x86_64 boot_sect_main.bin

SeaBIOS (version rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org)  
iPXE (http://ipxe.org) 00:03.0 C980 PCI2.10 PnP PMM+07F91410+07EF1410 C980  
  
Booting from Hard Disk...  
0xDADA  
0xFACE

We can now read from the disk and store what we read in a region of RAM.
