
Hex <-> Dec & Video Memory to ASCII string interpreter

 ·  🎃 kr0m

Two needs come up repeatedly when programming an OS: converting numbers between decimal and hexadecimal, and interpreting hexadecimal bytes as ASCII characters. In this article we will look at a Python script that covers both.

The script in question is as follows:

#!/usr/bin/env python3
import os
import re
from textwrap import wrap

os.system('clear')
print('----------------------------------------------------------')
print('| Hex <-> Dec & Video Memory to ASCII string interpreter |')
print('----------------------------------------------------------')
print('')

print('-- Select one option:')
print('1- Convert number')
print('2- Interpret Hex string as ASCII')
action = int(input(''))

if action == 1:
    print('')
    print('-- Select input base:')
    print('1- Decimal')
    print('2- Hexadecimal')
    base = int(input(''))
    
    if base != 1 and base != 2:
        print('++ ERROR: Incorrect base selected')
        exit()

    while True:
        print('')
        print('-- Introduce number, Ctrl+c to exit:')
        number = input('')

        if base == 1:
            hexadecimal = hex(int(number))
            print('Hex: {0}'.format(hexadecimal))
        elif base == 2:
            decimal = int(number, 16)
            print('Dec: {0}'.format(decimal))
        else:
            print('++ ERROR: Incorrect base selected')
            exit()
            
elif action == 2:
    while True:
        print('')
        print('-- Introduce Memory chars in GDB format, Ctrl+c to exit:')
        print("Enter/Paste your content. Ctrl-D to process it.")
        content = []
        while True:
            try:
                line = input()
            except EOFError:
                break
            content.append(line)
            
        print('--------------')
        #print(content)
        finalText = []
        for hexdata in content:
            asciiString = []
            asciiStringText = []
            memoryAddress = hexdata.split(':')[0]
            #print('memoryAddress:')
            #print(memoryAddress)
            rawData = hexdata.split(':')[1]
            #print('rawData:')
            #print(rawData)
            patt = re.compile("[^\t]+")
            for data in patt.findall(rawData):
                asciiChar_Attr = data.split('0x')[1]
                isChar = False
                for byteToDecode in wrap(asciiChar_Attr, 2):
                    #print(charToDecode)
                    # We don't want to decode format bytes, only character bytes
                    if isChar:
                        #print('Decoding: {0}'.format(byteToDecode))
                        try:
                            asciiChar = bytearray.fromhex(byteToDecode).decode()
                        except ValueError:  # invalid hex or non-decodable byte
                            print('Error decoding char')
                            asciiChar = '_'
                            
                        asciiString.append(asciiChar)
                        asciiStringText.append(asciiChar)
                    else:
                        asciiString.append(byteToDecode)
                    
                    isChar = not isChar
            
            count = 0
            asciiStringTextSorted = []
            for char in asciiStringText:
                if (count % 2) == 0:
                    tempVar = char
                else:
                    asciiStringTextSorted.append(char)
                    asciiStringTextSorted.append(tempVar)
                    
                count = count + 1
                    
            asciiStringTextSortedString = ''.join(asciiStringTextSorted)
            finalText.append(asciiStringTextSortedString)
            #print('Processed Address: {0} -> {1}'.format(memoryAddress, asciiStringTextSortedString))
            
            #print('{0} {1}'.format(memoryAddress, asciiString))
            
            finalAsciiString = []
            count = 0
            finalAsciiString.append('\t')
            finalAsciiString.append('0x')
            for char in asciiString:
                finalAsciiString.append(char)
                if (count % 4) == 3 and count < 15:
                    finalAsciiString.append('\t')
                    finalAsciiString.append('0x')

                count = count + 1
                
            finalAsciiStringStringified = ''.join(finalAsciiString)
            print('{0}: {1}'.format(memoryAddress, finalAsciiStringStringified))
        
        print('--------------')
        print('')
        print('finalText: {0}'.format(finalText))

else:
    print('++ ERROR: Incorrect option introduced')

Its operation is very simple. On the one hand, it converts between bases, which needs no explanation; on the other, it interprets the characters stored in video memory.
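For reference, the base conversion boils down to two built-ins, `hex()` and `int()` with an explicit base. The following is a minimal sketch; the helper names `dec_to_hex` and `hex_to_dec` are hypothetical and do not appear in the script.

```python
def dec_to_hex(text: str) -> str:
    """Convert a decimal string such as '753664' to a hex string."""
    return hex(int(text, 10))


def hex_to_dec(text: str) -> int:
    """Convert a hex string, with or without the 0x prefix, to an int."""
    return int(text, 16)


print(dec_to_hex("753664"))   # 0xb8000
print(hex_to_dec("0xb8000"))  # 753664
print(hex_to_dec("b8000"))    # 753664
```

Note that `int(text, 16)` accepts the value with or without the `0x` prefix, so addresses copied from GDB can be pasted as they are.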

To interpret the characters in the video memory, we must first dump them in GDB, which would show us an output like this:

gef➤  x/20x 0xb8000  
0xb8000: 0x0f650f57 0x0f630f6c 0x0f6d0f6f 0x0f200f65  
0xb8010: 0x0f6f0f74 0x0b530f20 0x0b650b74 0x0b6c0b6c  
0xb8020: 0x0b740b61 0x0b720b6f 0x0e530e4f 0x0f620f20  
0xb8030: 0x0f200f79 0x0f720f4b 0x0f6d0f30 0x0f760f20  
0xb8040: 0x0f2e0f30 0x0f620f32 0x0f000f00 0x0f000f00
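To read this dump by hand, it helps to know how each word is laid out. The sketch below assumes the standard VGA text-mode layout (each screen cell is a character byte followed by an attribute byte, with the foreground colour in the low nibble) and x86 little-endian byte order, so every 32-bit word GDB prints holds two cells in reverse order. The function name `split_word` is hypothetical.

```python
def split_word(word: int):
    """Split a 32-bit word from the dump into two (char, fg, bg) cells."""
    raw = word.to_bytes(4, "little")  # byte order as stored in memory
    cells = []
    for i in range(0, 4, 2):
        char_byte, attr = raw[i], raw[i + 1]
        # Low nibble: foreground colour. High nibble: background
        # (its top bit may act as the blink flag).
        cells.append((chr(char_byte), attr & 0x0F, attr >> 4))
    return cells


print(split_word(0x0f650f57))  # [('W', 15, 0), ('e', 15, 0)]
print(split_word(0x0b530f20))  # [(' ', 15, 0), ('S', 11, 0)]
```

Under those assumptions, attribute `0x0f` is white on black and `0x0b` is light cyan on black, which is why the attribute changes partway through the dump.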

If we paste the previous memory content into the script, we get:

----------------------------------------------------------  
| Hex <-> Dec & Video Memory to ASCII string interpreter |  
----------------------------------------------------------  
  
-- Select one option:  
1- Convert number  
2- Interpret Hex string as ASCII  
2  
  
-- Introduce Memory chars in GDB format, Ctrl+c to exit:  
Enter/Paste your content. Ctrl-D to process it.  
0xb8000: 0x0f650f57 0x0f630f6c 0x0f6d0f6f 0x0f200f65  
0xb8010: 0x0f6f0f74 0x0b530f20 0x0b650b74 0x0b6c0b6c  
0xb8020: 0x0b740b61 0x0b720b6f 0x0e530e4f 0x0f620f20  
0xb8030: 0x0f200f79 0x0f720f4b 0x0f6d0f30 0x0f760f20  
0xb8040: 0x0f2e0f30 0x0f620f32 0x0f000f00 0x0f000f00  
--------------  
0xb8000:  0x0fe0fW 0x0fc0fl 0x0fm0fo 0x0f 0fe  
0xb8010:  0x0fo0ft 0x0bS0f  0x0be0bt 0x0bl0bl  
0xb8020:  0x0bt0ba 0x0br0bo 0x0eS0eO 0x0fb0f   
0xb8030:  0x0f 0fy 0x0fr0fK 0x0fm0f0 0x0fv0f   
0xb8040:  0x0f.0f0 0x0fb0f2 0x0f0f 0x0f0f  
--------------  
  
finalText: ['Welcome ', 'to Stell', 'atorOS b', 'y Kr0m v', '0.2b\x00\x00\x00\x00']

As we can see, the script has decoded the character bytes and left the format bytes as hex. Finally, it prints only the ASCII characters, grouped by the memory lines that were entered: for example, address 0xb8000 corresponds to the text "Welcome ", address 0xb8010 to "to Stell", and so on.
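If only the text is of interest, the same decoding can be condensed considerably. The following is a rough sketch rather than a replacement for the script: `dump_to_text` is a hypothetical name, it matches words with a regex instead of splitting on tabs (so it should not matter whether the pasted dump uses tabs or spaces), and it assumes 32-bit little-endian words, as in the `x/20x` output above.

```python
import re

DUMP = """\
0xb8000: 0x0f650f57 0x0f630f6c 0x0f6d0f6f 0x0f200f65
0xb8010: 0x0f6f0f74 0x0b530f20 0x0b650b74 0x0b6c0b6c
0xb8020: 0x0b740b61 0x0b720b6f 0x0e530e4f 0x0f620f20
0xb8030: 0x0f200f79 0x0f720f4b 0x0f6d0f30 0x0f760f20
0xb8040: 0x0f2e0f30 0x0f620f32 0x0f000f00 0x0f000f00
"""


def dump_to_text(dump: str) -> str:
    """Extract the on-screen text from a GDB dump of video memory."""
    chars = []
    for line in dump.splitlines():
        if ":" not in line:
            continue  # skip prompts and blank lines
        _, _, data = line.partition(":")  # drop the address column
        for word in re.findall(r"0x([0-9a-fA-F]{8})", data):
            raw = bytes.fromhex(word)[::-1]  # reverse to memory order
            # Even offsets hold characters, odd offsets hold format bytes.
            # chr() is only accurate for ASCII; higher values are not
            # mapped to the VGA character set here.
            chars.extend(chr(b) for b in raw[0::2])
    return "".join(chars).rstrip("\x00")  # trim empty trailing cells


print(dump_to_text(DUMP))  # Welcome to StellatorOS by Kr0m v0.2b
```

Reversing the bytes of each word up front removes the need for the pair-swapping loop, at the cost of discarding the format bytes entirely.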

NOTE: This code can probably be integrated into GDB as a plugin.
