Review of macros

Single-line macros: %define or %assign for arithmetic “variable-like” macros.

Multi-line macros %macro/%endmacro

Conditions: %if/%elif/%else/%endif

Repetition: %rep/%endrep

A sample macro: building a WHILE/ENDWHILE loop:

WHILE rax, ge, 0
    ... ; loop body
ENDWHILE

Because the WHILE and ENDWHILE need to communicate with each other (ENDWHILE needs to know the label to jump to), we will have to use a context scope:

%macro WHILE 3
    %push while
    %$while:
    cmp %1, %3
    j%-2 %$endwhile     ; exit the loop when the condition is false
%endmacro

%macro ENDWHILE 0
    jmp %$while
    %$endwhile:
    %pop
%endmacro

A smarter ENDWHILE would check to make sure we are actually in a WHILE before popping the context stack.
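As a rough sketch of that smarter version, the following uses the preprocessor's %ifctx test to confirm that the top of the context stack is a while context before touching it. The error message text is only an illustration. The WHILE macro is repeated so that the fragment assembles on its own, and bits 64 is assumed because the usage example works on rax.

```nasm
bits 64

%macro WHILE 3
    %push while
    %$while:
    cmp %1, %3
    j%-2 %$endwhile         ; inverted condition: leave when the test fails
%endmacro

%macro ENDWHILE 0
    %ifctx while
        jmp %$while         ; go back and re-test the condition
        %$endwhile:
        %pop
    %else
        %error "ENDWHILE without a matching WHILE"
    %endif
%endmacro

; Usage: count rax down from 3 until it goes negative.
    mov rax, 3
    WHILE rax, ge, 0
        dec rax
    ENDWHILE
```

With this version, a stray ENDWHILE stops the assembly with an error instead of silently popping some other macro's context.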

Basics of building an operating system

We’ll spend the last two weeks of class hopefully working up to the point where we can boot our own “operating system” on a raw virtual machine. We’ll use QEMU for the virtual machine (it’s installed on the server, and it can communicate with GDB if we need to debug our code).

Boot process

There are two processes by which the system “boots” (i.e., transfers control from the hard-coded functionality on the system chips to the code defined on some storage medium), both of which involve running some code on a storage device (hard-drive, flash-drive, DVD, etc.):

We’ll stick with MBR, as that is easier for us.

MBR

An MBR-formatted disk has the first 512 bytes of the disk devoted to the master boot record. The MBR contains both the partition table, defining how the disk is divided up into up to four partitions, as well as the boot code, which is executed on system startup. Each partition, in turn, if it is marked as bootable, can have its own boot record, containing its own boot code. The typical behavior of the MBR is just to find the first bootable partition and then load and run its boot code, but it’s possible to do fancier things like display a menu of bootable partitions and such.

For our purposes, the boot code in the MBR will be the “operating system”; i.e., we’ll write the code we want to run directly into the boot code, rather than using a generic boot loader and writing our code into a partition’s boot record. This means that our “operating system” will be the only operating system allowed on the disk.

The MBR is limited to 512 bytes, but these 512 bytes include, just before the end, the partition table, defining what partitions are on the disk. We’ll leave this table blank (filled with 0s), as it won’t really matter for us, but you should be aware that it’s there. So technically we only have 440 bytes available in the MBR for our code.

Contents                     Bytes      Size
Bootloader                   0-439      440 bytes
Disk ID                      440-443    4 bytes
Reserved, must be 0          444-445    2 bytes
1st partition entry          446-461    16 bytes
2nd partition entry          462-477    16 bytes
3rd partition entry          478-493    16 bytes
4th partition entry          494-509    16 bytes
Signature, must be 0xaa55    510-511    2 bytes
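The layout above can be written out directly as an assembly skeleton. This is only a sketch: the boot code is a placeholder that hangs, and the disk ID and partition entries are left as zeros. A side benefit of padding to 440 rather than 510 is that the times count goes negative, and assembly fails, if the code ever grows into the partition table.

```nasm
bits 16
org 0x7c00

start:
    jmp start                   ; placeholder boot code: hang

times 440 - ($ - $$) db 0       ; pad the bootloader area (bytes 0-439)
dd 0                            ; disk ID (bytes 440-443)
dw 0                            ; reserved, must be 0 (bytes 444-445)
times 4 * 16 db 0               ; four empty partition entries (bytes 446-509)
dw 0xaa55                       ; boot signature (bytes 510-511)
```

The pieces add up to 440 + 4 + 2 + 64 + 2 = 512 bytes.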

16-bit real mode

When the system starts up, it is running in 16-bit “real mode” for compatibility with older software. Although eventually we will (hopefully) transition to 64-bit mode so we can do the things we expect, for now, it will be easier for us to adapt to working in 16-bit mode. Also, while in 16-bit mode, we have access to the BIOS, a built-in set of utility operations which allow us to perform input/output relatively easily. After we switch to 64-bit mode, the BIOS is unavailable, so communicating with the user becomes much more difficult and complex.

In 16-bit mode, although we have access to the 32-bit registers, all memory addresses are 16 bits. I.e., we can only access 64KB of memory! That’s clearly not what we want, so 16-bit mode makes heavy use of segmentation. Segmented memory is enabled on startup; we don’t have to do anything to turn it on.

Memory locations are of the form SEGMENT:ADDRESS and the effective address is computed as SEGMENT * 0x10 + ADDRESS. E.g.,

mov word [es:si], ax

will move the word currently in ax into the memory location es * 0x10 + si. es is one of the segment registers; in a memory operand the segment part must be a segment register (a constant segment can only appear in a far jump or far call). Most instructions will use a default segment: e.g., mov defaults to ds, the data segment (unless the address uses bp, which defaults to ss), so

mov word [si], ax

is actually equivalent to

mov word [ds:si], ax

Every memory access involves a segment register, either named explicitly or implied by default.

If both SEGMENT and ADDRESS are limited to 16 bits, how much memory can we access? 64KB × 0x10 = 1MB. Note that addresses in a 1MB address space require 20 bits to be represented; the original memory controller for x86 only had 20 address lines. A SEGMENT:ADDRESS pair can name addresses slightly higher than 1MB (up to 0xFFFF × 0x10 + 0xFFFF = 0x10FFEF). However, the original behavior was to wrap these addresses around to 0, so this is still what many PCs do by default. To access more memory, the A20 line (the 21st address line) must be enabled.
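To make the arithmetic concrete, here is a small sketch that lets the assembler itself check a few address computations. LINEAR is a hypothetical helper macro introduced only for this example, and the segment and offset values are arbitrary.

```nasm
; Assemble-time check of the SEGMENT * 0x10 + ADDRESS rule.
%define LINEAR(seg,off) ((seg) * 0x10 + (off))

; Different SEGMENT:ADDRESS pairs can name the same location.
%if LINEAR(0x07c0, 0x0000) != LINEAR(0x0000, 0x7c00)
    %error "0x07c0:0x0000 and 0x0000:0x7c00 should be the same address"
%endif

; The highest address a pair can name is just under 1MB + 64KB.
%if LINEAR(0xffff, 0xffff) != 0x10ffef
    %error "highest reachable address should be 0x10ffef"
%endif

; Emit two computed addresses so the file produces some output.
dd LINEAR(0x1000, 0x0234)       ; 0x00010234
dd LINEAR(0xffff, 0xffff)       ; 0x0010ffef
```

The first check also shows why comparing far pointers is awkward: two pairs with different segment and offset parts can still refer to the same byte.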

Access to data/code in the current segment is usually called near, while access to data/code in a segment other than the current one is called far. The latter requires loading the relevant segment register first, and thus is slower. E.g., some older versions of C made a distinction between “near” pointers (pointers to data in the current segment) and “far” pointers (a pointer to a different segment). Note that the two kinds of pointers have completely different representations! A near pointer is just an unsigned 16-bit value, but a far pointer must store both the segment and the address, and thus must be 32 bits. (Even worse, consider what happens when p == q where p and q are pointers to different, but possibly overlapping, segments.)

Segment registers

The segment registers are

CS    Code segment (used by jmp)
DS    Data segment (used by mov)
SS    Stack segment (used by push)
ES    Extra segment (used by string ops)
FS    General-purpose segment
GS    General-purpose segment

The string operations which use both si and di implicitly use both the data and extra segments: ds:si and es:di.

To write a value into a segment register, we have to first move it into a general-purpose register, and then into the segment register:

mov ax, 0x1000      ; must fit in 16 bits
mov ds, ax

This means that a “far” memory access requires three instructions instead of just one for a “near” access:

mov ax, 0x1000      ; must fit in 16 bits
mov fs, ax
mov dword [fs:addr], ebx

Segment registers can also be pushed onto and popped from the stack (except cs, which can be pushed but not popped). The cs register cannot be directly modified, as it controls where the currently-executing program is located in memory; i.e., the instruction pointer is effectively cs:ip. It is changed implicitly by a “far jmp” (a jump of the form jmp SEGMENT:ADDRESS), a far call, or a far ret. Because the segment registers have such a big effect on memory access, they should all be treated as callee-preserved: a function that changes one should push it first and pop it again before returning.
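As an illustration of preserving a segment register around a far access, here is a minimal sketch. The function name read_far_word and its register conventions (segment in dx, offset in bx, result in ax) are made up for this example.

```nasm
bits 16

; Example call: read the word at 0x1000:0x0234 into ax.
    mov dx, 0x1000
    mov bx, 0x0234
    call read_far_word
hang:
    jmp hang

; read_far_word (hypothetical helper): return in ax the word stored at
; segment dx, offset bx. ds is restored before returning.
read_far_word:
    push ds                 ; save the caller's data segment
    mov ds, dx              ; load ds from a general-purpose register
    mov ax, [bx]            ; default segment is ds, now the far segment
    pop ds                  ; restore the caller's data segment
    ret
```

The caller can keep using its own ds after the call without knowing that the function ever changed it.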

Some things to note:

32-bit mode

In 32-bit mode, without paging enabled, the segment part of each address is no longer just multiplied by 0x10, but rather it is treated as an index into a table of segment descriptors, called the global descriptor table. The GDT stores, for each segment, its starting address and length, as well as a few other bits of information.

Of course, in 32-bit mode, addresses are already 32 bits, so the easiest thing to do is to fill the GDT with descriptors for segments starting at address 0, and having the same size as the amount of memory (or simply the full 4GB), and then point every segment register at one of them, thus making every logical address map directly to the same physical address. (Entry 0 of the GDT is reserved as the “null descriptor” and cannot be used for this, so the usable descriptors start at GDT[1].) This is called “32-bit flat mode”.
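A flat-mode GDT might be laid out roughly as follows. This is a sketch under several assumptions: the label names are invented, the table lives in a bootloader assembled with org 0x7c00 (so labels equal physical addresses), and separate code and data descriptors are used because x86 requires a code descriptor for cs. Entry 0 is the null descriptor that x86 reserves, so the usable entries are 1 and 2. Actually switching into 32-bit mode takes further steps that are not shown here.

```nasm
bits 16
org 0x7c00

load_gdt:
    lgdt [gdt_descriptor]   ; tell the CPU where the table is
hang:
    jmp hang

gdt_start:
    dq 0                    ; entry 0: null descriptor

gdt_code:                   ; entry 1 (selector 0x08): code segment
    dw 0xffff               ; limit, bits 0-15
    dw 0x0000               ; base, bits 0-15
    db 0x00                 ; base, bits 16-23
    db 10011010b            ; present, ring 0, code, readable
    db 11001111b            ; 4KiB granularity, 32-bit, limit bits 16-19
    db 0x00                 ; base, bits 24-31

gdt_data:                   ; entry 2 (selector 0x10): data segment
    dw 0xffff               ; limit, bits 0-15
    dw 0x0000               ; base, bits 0-15
    db 0x00                 ; base, bits 16-23
    db 10010010b            ; present, ring 0, data, writable
    db 11001111b            ; 4KiB granularity, 32-bit, limit bits 16-19
    db 0x00                 ; base, bits 24-31
gdt_end:

gdt_descriptor:
    dw gdt_end - gdt_start - 1  ; size of the table in bytes, minus one
    dd gdt_start                ; address of the table
```

Both descriptors have base 0 and a limit of 0xfffff pages of 4KiB, so each covers the whole 4GB address space.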

Memory map

The 1MB of memory that is traditionally available to us in 16-bit mode can be mapped out as:

start        end          size            type                              description
Low Memory (the first MiB)
0x00000000   0x000003FF   1 KiB           RAM - partially unusable          Real Mode IVT (Interrupt Vector Table)
0x00000400   0x000004FF   256 bytes       RAM - partially unusable          BDA (BIOS data area)
0x00000500   0x00007BFF   almost 30 KiB   RAM (guaranteed free for use)     Conventional memory
0x00007C00   0x00007DFF   512 bytes       RAM - partially unusable          Your OS BootSector (typical location)
0x00007E00   0x0007FFFF   480.5 KiB       RAM (guaranteed free for use)     Conventional memory
0x00080000   0x0009FFFF   128 KiB         RAM - partially unusable          EBDA (Extended BIOS Data Area)
0x000A0000   0x000FFFFF   384 KiB         various (unusable)                Video memory, ROM Area

The BIOS Data Area and Extended BIOS Data Areas are essentially “RAM for the BIOS”. As the BIOS has functions and global variables of its own, it needs a place to put its stack and data segments. A few bytes of the BDA are standardized and useful:

Input/Output

x86 systems support several methods for performing input/output:

Interrupts

At its most basic level, the interrupt system is just an array of 256 pointers to functions (dwords, because they include both the segment and offset), stored starting at memory address 0x0. When interrupt number n occurs, the system looks up the entry at index n in the table and calls its function. Some of these functions are provided for us, by the BIOS, and are essential to the operation of the system. Others are just do-nothing stub functions, intended to be replaced by pointers to our functions, so that we can control what happens when the specified interrupt occurs.

Interrupts can be divided into three groups:

The interrupts are mapped to the code that handles them through the interrupt vector table. The IVT in 16-bit mode is stored in the first 256×4 bytes of memory, starting at physical address 0. The dword at index n is the address of an interrupt handler, a procedure which the processor runs automatically when interrupt n occurs. The addresses are always stored as SEG:ADDR pairs, with the offset in the low word and the segment in the high word; the effective address is computed as 0x10 * SEG + ADDR. Some interrupts do double-duty: they are fired when an event occurs, but are also usable by our code to call a BIOS function. Many interrupts have subfunctions, specified by the value in ah or other registers.
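For instance, the entry for interrupt 0x10 lives at physical address 0x10 * 4 = 0x40. The following sketch reads it; the choice of bx and cx as destinations is arbitrary.

```nasm
bits 16

    push es                     ; keep the caller's es
    xor ax, ax
    mov es, ax                  ; es = 0, so es:offset is a physical address
    mov bx, [es:0x10 * 4]       ; low word: handler offset
    mov cx, [es:0x10 * 4 + 2]   ; high word: handler segment
    pop es
    ; The handler for int 0x10 now sits at cx * 0x10 + bx.
```

Installing our own handler would be the reverse: write our offset and segment into the same two words, ideally with interrupts disabled while the entry is half-updated.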

In 32- and 64-bit modes, the IVT is replaced by the interrupt descriptor table (IDT). Each entry is more than just an address, and the location of the table is not fixed at address 0x0, but specified via the idtr register (loaded with the lidt instruction).

For a reference to every interrupt ever (and a website straight outta 1997), see Interrupt Jump Table.

A simple bootloader

We’ll begin by writing a simple bootloader program which will just display Hello, world! on the screen and then go into an infinite loop, effectively hanging the (virtual!) machine.

Structure of a bootloader

A bootloader must be exactly 512 bytes long, and must end with the word value 0xAA55 (stored little-endian, so the last two bytes on disk are 0x55 followed by 0xAA), which tells the BIOS that this is a valid (bootable) record. The BIOS will load the entire 512 bytes into memory, starting at address 0x7c00.

Assembling the bootloader

We will have to assemble the bootloader manually, not using asm, as we do not want to generate an object file. Instead, we want a raw .bin file containing nothing but the assembled machine code. We also have to inform the assembler of the starting address of our code, via the org directive (remember that the system will load the 512 bytes of the bootloader into memory at address 0x7c00):

;; 
;; hello-boot.s
;;
bits 16
org 0x7c00

; Boot code begins here
; ...

; Hang system
forever:    jmp forever     ; (loop is an instruction name, so it cannot be used as a label)

; Pad remainder with 0 bytes
times 510 - ($ - $$)    db 0

; Write boot signature at end
dw 0xaa55

Note that while we can create a .s file which results in a binary larger than 512 bytes, and we can create a disk image containing this binary, the system will only load the first 512 bytes into memory; if we want to load any additional code from disk, we have to do it manually.

The bits directive tells YASM to generate 16-bit code. This is not strictly necessary, as it will default to generating 16-bit code when outputting a bin file, but helps to make it obvious what we’re doing.

To assemble this we run

yasm hello-boot.s -f bin -o boot.bin
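As a preview of what loading additional code “manually” could look like, here is a rough sketch using the BIOS disk service, interrupt 0x13 with ah = 0x02 (read sectors). It assumes the BIOS left the boot drive number in dl, which is the usual convention, and that the extra code occupies the second sector of the image. The labels disk_error and second_stage are invented for this example.

```nasm
bits 16
org 0x7c00

    xor ax, ax
    mov es, ax              ; es:bx = 0x0000:0x7e00, just past the boot sector
    mov bx, 0x7e00

    mov ah, 0x02            ; int 0x13 subfunction: read sectors
    mov al, 1               ; number of sectors to read
    mov ch, 0               ; cylinder 0
    mov cl, 2               ; sector 2 (sectors are numbered from 1)
    mov dh, 0               ; head 0
                            ; dl = drive number, assumed untouched since boot
    int 0x13
    jc  disk_error          ; carry flag is set if the read failed
    jmp 0x0000:0x7e00       ; run the code we just loaded

disk_error:
    jmp disk_error          ; nothing sensible to do: hang

times 510 - ($ - $$) db 0
dw 0xaa55

; Everything below is the second sector, loaded at 0x7e00.
second_stage:
    jmp second_stage

times 1024 - ($ - $$) db 0  ; pad the image to two full sectors
```

Because of the org directive, second_stage is assembled at 0x7c00 + 512 = 0x7e00, which matches the address the sector is loaded to.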

Writing to the screen: memory-mapped IO

To write Hello, world! to the screen, we’ll embed the string in our code, somewhere where it won’t be interpreted as instructions, obviously, and then copy it to the memory-mapped display address. There are no sections in this program, so any data has to be placed within the program, and the program set up so that execution never reaches the data.

In text mode, the display memory starts at address 0xB8000. Each character cell consists of two bytes: the first is the character, and the second is its attributes (foreground and background color). When we copy the string to the screen, we have to copy the characters to the character bytes, and skip over the attribute bytes.

In addition, in 16-bit mode, the allowed registers that can be used as indexes in memory operands are very restricted: only bx, bp, si, and di can be used (and bp defaults to the stack segment rather than the data segment), and no scale is allowed.

; Set 80x25 text mode
mov ah, 0x0
mov al, 0x3
int 0x10

; Setup fs segment to access video RAM
mov ax, screen_addr / 16
mov fs, ax

; Print text
mov cx, strlen     ; Loop counter
mov bx, 0          ; String index
mov si, 0          ; Memory index
print:
    mov al, byte [bx + string]   ; Load current char. (into al, so bx is not clobbered)
    mov byte [fs:si], al         ; Print to screen
    inc bx
    add si, 2                    ; Skip over the attribute byte
loop print

; Infinite loop
forever: jmp forever

; Unreachable, so it is safe to store data here
string:         db      "Hello, world!"
strlen:         equ     $-string
screen_addr:    equ     0xb8000

; Padding bytes and signature...

To be on the safe side, we specifically set the video mode to text mode. Most BIOSes will do this for us, but a few will not.

Note that because the bin file format does not support sections, there is no .data section: we simply place the string directly in the executable itself, placing it after the infinite loop so that it will not be executed.
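The program above only touches the character bytes. As a sketch of how the attribute byte and the cell layout fit together, the following writes a single character at a chosen row and column with a chosen color. It assumes the display is already in 80x25 text mode, and the ROW, COL and ATTR values are arbitrary examples.

```nasm
bits 16
org 0x7c00

ROW     equ 12
COL     equ 40
ATTR    equ 0x1f            ; high nibble = background (blue), low = foreground (white)

    mov ax, 0xb800          ; 0xb8000 / 16
    mov fs, ax
    mov di, (ROW * 80 + COL) * 2    ; 80 cells per row, 2 bytes per cell
    mov byte [fs:di], 'X'           ; character byte
    mov byte [fs:di + 1], ATTR      ; attribute byte

forever: jmp forever

times 510 - ($ - $$) db 0
dw 0xaa55
```

Here the offset works out to (12 * 80 + 40) * 2 = 2000 bytes from the start of video memory.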

Assembly, disk image, run in QEMU

To assemble this we run

yasm hello-boot.s -f bin -o boot.bin

This creates the pure binary file boot.bin. From this, we will have to create a disk image so that the emulator can boot it.

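dd if=/dev/zero of=boot.dsk bs=512 count=2880
dd if=boot.bin of=boot.dsk conv=notrunc

The first command creates a blank image the size of a traditional 1.44MB floppy disk (2880 sectors of 512 bytes); the second copies our bootloader over the start of the image without truncating the rest. (Copying boot.bin directly with count=2880 would stop at the end of the input and produce an image only as large as boot.bin.) We’ll have to repeat the second step every time we modify our bootloader’s code.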

Finally, we can start the emulator with the specified disk:

qemu-system-i386 -curses -drive format=raw,file=boot.dsk

(Note that this is actually loading our disk as a hard drive disk image, rather than as a floppy disk. It doesn’t make a difference right now, as they both have the same MBR structure, but later, when we want to read more from the disk than just the bootloader, we’ll need to know where we were booted from.)

To quit (because the emulator is intercepting all our key presses!) press Esc followed by 2, to switch to the QEMU console, and then type quit.

Writing to the screen, method 2: BIOS

Instead of writing directly to video memory, we can also use BIOS calls to write one character at a time. Most of the BIOS calls for dealing with video are via interrupt 0x10, the same interrupt we used above to set the video mode.

All of the interrupts we want to use will use ah for the subfunction, bh as the page number (for us this should always be 0), as well as cx and dx for various things. We’ll have to do a bit more work to make this fit with our registers.

To write a single character to the screen, we invoke interrupt 0x10, subfunction ah = 0x0a. al should be the character to display. Note that this doesn’t move the cursor! To set the cursor position, we have to use subfunction ah = 0x02, with the cursor row/column in dh:dl.

To write the character in al on the screen, at the current cursor location:

mov ah, 0x0a
mov al, character...
mov cx, 1 
int 0x10

We have to keep track of the cursor position ourselves, but fortunately bx (the current index into the string) can do double-duty as the cursor position.

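mov ah, 0x02    ; Subfunction = set cursor pos.
mov dh, 0       ; Cursor row = 0
mov dl, bl      ; Cursor col = string index (bh, the page, stays 0 while the index is below 256)
int 0x10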

Shuffling a few registers around gives us

bits 16
org 0x7c00

; Set 80x25 text mode
mov ah, 0x0
mov al, 0x3
int 0x10

; Print text
mov si, 0          ; Memory index/cursor position
print:
    ; Print character
    mov ah, 0x0a    ; Subfunction = write char
    mov al, byte [si + string]
    mov bh, 0       ; Page = 0
    mov cx, 1       ; Write count = 1
    int 0x10

    ; Move cursor
    inc si
    mov ah, 0x02    ; Subfunction = set cursor pos.
    mov bh, 0       ; Page = 0
    mov dx, si      ; Cursor col (dl) = low byte of si
    mov dh, 0       ; Cursor row = 0
    int 0x10

    cmp si, strlen
    jne print

; Infinite loop
forever: jmp forever

; Unreachable, so it is safe to store data here
string:         db      "Hello, world!"
strlen:         equ     $-string
screen_addr:    equ     0xb8000

; Pad remainder with 0 bytes
times 510 - ($ - $$)    db 0

; Write boot signature at end
dw 0xaa55

You may note a difference between this and the previous example: the cursor in this example is left at the end of the string printed, while in the direct-memory-write example it is left at the beginning, because writing directly to memory has no effect on the cursor position.

Connecting a debugger

It’s possible to connect a debugger (GDB) to the virtual machine, so we can debug our code. To do this, you’ll have to choose a port which no one else on the server is using. I’d suggest picking a random port between 9000 and 10000.

Start the emulator with

qemu-system-i386 -S -gdb tcp::9XXX -curses -drive format=raw,file=boot.dsk

(Replace XXX with the last three digits of your chosen port.) This starts the emulator paused, waiting for a connection from GDB. Then, open another SSH session to the server, run GDB (gdb) and type

target remote localhost:9XXX

using the same port number as above. From there, you can set breakpoints (but only at numerical addresses), use continue to resume the boot process, etc.