Review of last time
Getting into 32-bit protected mode
Getting into protected mode is the first step to getting into 64-bit “long” mode. The steps for doing so are:
1. Disable interrupts. We don’t want an interrupt to fire while we are changing the system mode, as the interrupt handler won’t work correctly. The cli instruction will do this.
2. Enable the A20 line, to allow for the larger address space. (Remember that in 16-bit mode, we can only access a 20-bit address space.) There are several ways of doing this, of varying levels of safety and complexity. We choose to use the “fast A20” port, which works on the emulator we’re using, so this is just
in al, 0x92
or al, 2 ; Set bit 1
out 0x92, al
3. Load the Global Descriptor Table (GDT) with segment descriptors. In 32-bit mode, instead of the values in the segment registers being used directly, as segment addresses, they are indexes into a table, the GDT, where each entry in the table contains information about that segment.
This is the most complex part of the process, as the table entries have a lot of information about the segments in them.
4. Switch to 32-bit mode, by setting the low bit of register CR0. This is just
mov eax, cr0
or eax, 1
mov cr0, eax
but afterwards we want to do a “far jump” to flush the pipeline and load the new code segment into CS:
jmp 0x8:protected_mode
bits 32
protected_mode:
...
Segment 0x8 is the code segment we configured in the GDT.
5. Install a 32-bit compatible interrupt descriptor table (IDT).
6. Re-enable interrupts via sti.
Disable interrupts
The normal interrupt handlers installed by the BIOS will only work in 16-bit mode. We don’t want them accidentally firing in the middle of the switching process, so we disable interrupts. (Note that even if nothing “obvious” happens to fire an interrupt, such as keyboard input, interrupts can still be fired. E.g., there is a timer interrupt which fires about 18.2 times every second.)
Interrupts are controlled by the IF flag in the flags register, so to disable them we just clear this flag:
cli
At the end, after we’ve installed our own 32-bit interrupt handlers, we can re-enable them by setting the flag with
sti
Install GDT
Recall that in 32-bit mode, the segment registers do not contain actual addresses, but indexes into a table of segment descriptors. This table is called the global descriptor table and must be built, by us, to describe the segments available.
Theoretically this is just a matter of writing a GDT in our (pseudo) data section, copying it into memory in a suitable location, and then using the lgdt instruction to load it into the CPU. However, the table’s entries have quite a complex structure which will require some understanding. The basic decisions we need to make for each segment are:
Starting address (base) and size (limit)
Type: code or data
For data, read-only or read-write?
The GDT is an array of segment descriptors, where each segment descriptor should have the following form:
struct seg_desc {
unsigned short limit; // Segment size (low 16 bits)
unsigned short base_low; // Low 16 bits of segment base address
unsigned char base_mid; // Middle 8 bits of seg. base
unsigned char type : 5; // Segment type, attributes
unsigned char priv : 2; // Privilege level
unsigned char present : 1; // Is segment present?
unsigned char limit_high : 4; // High 4 bits of seg. size
unsigned char attr : 3; // More attributes
unsigned char granularity : 1; // Affects segment size
unsigned char base_high; // High 8 bits of segment base
};
63-56 | 55 | 54-52 | 51-48 | 47 | 46-45 | 44-40 | 39-32 | 31-16 | 15-0 |
---|---|---|---|---|---|---|---|---|---|
Base 24:31 | Gran | Attr | Limit 16:19 | Present | Priv | Type | Base 16:23 | Base 0:15 | Limit 0:15 |
Note that segment bases have 32 bits, spread out over the fields base_low, base_mid, and base_high. Segment limits (sizes) have 20 bits, meaning that the largest segment is 1 MB, but this is affected by the granularity bit. If the granularity bit is set, then the limit is interpreted not as a number of bytes but as a number of 4 KB pages, which allows for a maximum segment size of 2^20 × 4 KB = 4 GB, which is the total amount of memory addressable in 32-bit mode anyway.
The entire structure is 64 bits, or one qword.
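To make the bit layout concrete, here is a minimal C sketch that packs a base, limit, and attribute bits into one 64-bit descriptor. The function name pack_descriptor and its parameter split (an 8-bit access byte for Present/Priv/Type, a 4-bit flags nibble for Gran/Attr) are our own illustrative choices, not part of any standard API.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a segment descriptor into the 64-bit layout from the table.
 * access: the byte holding Present/Priv/Type   (bits 47-40)
 * flags:  the nibble holding Gran and Attr     (bits 55-52)
 * Hypothetical helper, for illustration only. */
uint64_t pack_descriptor(uint32_t base, uint32_t limit,
                         uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF);  /* the limit field is only 20 bits wide */
    assert(flags <= 0xF);

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* Limit 0:15  -> bits 15-0  */
    d |= (uint64_t)(base & 0xFFFF) << 16;         /* Base 0:15   -> bits 31-16 */
    d |= (uint64_t)((base >> 16) & 0xFF) << 32;   /* Base 16:23  -> bits 39-32 */
    d |= (uint64_t)access << 40;                  /* P/Priv/Type -> bits 47-40 */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* Limit 16:19 -> bits 51-48 */
    d |= (uint64_t)flags << 52;                   /* Attr, Gran  -> bits 55-52 */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;   /* Base 24:31  -> bits 63-56 */
    return d;
}
```

For example, a segment with base 0x7c00, limit 0x78400, access byte 0b10011010 and flags 0b0100 packs to 0x00479A007C008400. Read as little-endian bytes, that is 00 84 00 7c 00 9a 47 00.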
Entry 0 in the table is reserved for the null segment; if you try to use the null segment, a processor exception will occur, so we load the first entry of the table with 0:
gdt:
dq 0 ; Null segment
We will define four entries after this in the segment table:
A segment for code, starting at 0x7c00 and extending to 0x7FFFF (size 0x78400 bytes). This is the normal place where our first and second stages will be loaded.
A segment for data, starting at 0x100000 and extending to 0xEFFFFF (14 MB, or 0xE00000 bytes). This is located in the “extended memory”, above 1 MB, and hence is only accessible after we have enabled the A20 line. Note that “global” data will still also be located inside the code segment, because it gets loaded with the rest of the bootloader.
As this segment is larger than the normal segment max size, it will have to use the granularity bit. Thus, the limit will be 0xE00000 / 4096 = 0xE00.
A segment for the stack, overlapping the previous. We’ll set things up so that the stack grows downward, from the end of the segment.
A segment for direct access to video memory, starting at address 0xb8000 extending to 0xBFFFF with a size of 32KB (0x8000).
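The effect of the granularity bit on these sizes can be checked with a small C sketch. It assumes the usual x86 rule that the limit field holds the last valid offset, so the addressable size is one more than the (scaled) limit. The helper name segment_size is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable through a segment, given its raw 20-bit limit field
 * and granularity bit. Assumes the limit is the last valid offset. */
uint64_t segment_size(uint32_t limit, int granularity)
{
    assert(limit <= 0xFFFFF);
    uint64_t last_offset = granularity
        ? (((uint64_t)limit << 12) | 0xFFF)  /* limit counts 4 KB units */
        : limit;                             /* limit counts bytes      */
    return last_offset + 1;
}
```

Under this rule, segment_size(0xFFFFF, 0) is 1 MB and segment_size(0xFFFFF, 1) is 4 GB. A limit of 0xDFF with the granularity bit set covers exactly 0xE00000 bytes, while 0xE00 covers one extra 4 KB page.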
Once we have the GDT set up in memory, we need to load it into the CPU, using the lgdt instruction. The GDT descriptor tells the CPU both the address at which the GDT exists and how large it is. The low 16 bits are the limit of the GDT (its size in bytes, not in entries, minus one), while the high 32 bits are its linear address in memory.
gdt:
dq 0 ; Segment 0
; Segment 1 -- Code
dw 0x8400 ; Low 16 bits of limit (total 0x78400)
dw 0x7c00 ; Low 16 bits of base
db 0 ; Middle 8 bits of base
db 0b10011010 ; Present (1 bit), Priv (2), S, Ex, Dc, Rw, Ac
db 0b01000111 ; Gran (1), 16/32 (1), 0s (2), Limit high (4)
db 0 ; High 8 bits of base
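; Segment 2 -- Data
dw 0x0E00 ; Low 16 bits of limit, in 4 KB units
dw 0x0000 ; Low 16 bits of base (total 0x100000)
db 0x10 ; Middle 8 bits of base
db 0b10010010 ; Present, Priv 0, S = 1, data, read-write
db 0b11000000 ; Gran = 1, 32-bit, Limit high = 0
db 0 ; High 8 bits of base
; Segment 3 -- Stack (identical to previous)
dw 0x0E00
dw 0x0000
db 0x10
db 0b10010010
db 0b11000000
db 0
; Segment 4 -- Video
dw 0x8000 ; Low 16 bits of limit (32 KB)
dw 0x8000 ; Low 16 bits of base (total 0xb8000)
db 0x0b ; Middle 8 bits of base
db 0b10010010 ; Present, Priv 0, S = 1, data, read-write
db 0b01000000 ; Gran = 0, 32-bit, Limit high = 0
db 0 ; High 8 bits of base
gdt_limit: equ $ - gdt - 1 ; GDTR limit = size in bytes, minus one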
We can load this directly into the GDTR register by using a combination of gdt and gdt_limit. The lgdt instruction expects the address of a GDTR structure, which consists of a word, giving the GDT limit (size of the GDT in bytes, minus one), followed by a dword, giving the base (linear) address of the GDT itself. We can allocate this structure immediately after the GDT:
gdtr_struct:
dw gdt_limit
dd gdt
and then load it with
mov ax, 0
mov ds, ax ; Make address of gdtr linear
lgdt [gdtr_struct] ; Operand is a memory reference
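The six-byte structure that lgdt reads can be mirrored in C, which makes the packing requirement explicit. This is a sketch only: struct gdtr and make_gdtr are hypothetical names, and it assumes the x86 convention that the limit word is the table size in bytes minus one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)          /* no padding between the word and the dword */
struct gdtr {
    uint16_t limit;            /* size of the table in bytes, minus one */
    uint32_t base;             /* linear address of the first descriptor */
};
#pragma pack(pop)

/* Build the structure for a table of entry_count 8-byte descriptors. */
struct gdtr make_gdtr(uint32_t table_addr, unsigned entry_count)
{
    assert(entry_count >= 1 && entry_count <= 8192);
    struct gdtr r;
    r.limit = (uint16_t)(entry_count * 8 - 1);
    r.base = table_addr;
    return r;
}
```

With the five descriptors defined here (the null entry plus four segments), the table is 40 bytes long, so the limit word is 39.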
If we were writing a kernel capable of running multiple processes, we’d want to set up some additional segments:
A task state segment (TSS) stores information about each task (process or thread), used by the CPU during task switches.
Local descriptor table entries; each task can have its own set of segments, which it sees as if they were the entire GDT. This can be used to give each task a separate set of segments (its own code, data, stack, etc.).
Finally, we can set up the segment registers. Note that the values in the segment registers are not strictly indexes into the GDT, but rather byte offsets. Thus, the segment numbers above should be multiplied by 8, the size of a single descriptor.
Once protected mode has been enabled, we reload the data segment registers. (CS cannot be loaded with mov; it is loaded by the far jump that follows.)
mov ax, 2 * 8
mov ds, ax ; Data segment
mov ax, 3 * 8
mov ss, ax ; Stack segment
mov ax, 4 * 8
mov es, ax ; Video segment
And then “jump” into the “new” code segment:
jmp 0x8:protected_mode
bits 32
protected_mode:
...
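The selector values used above (0x8 for the code segment, and so on) follow from how a selector is laid out: the descriptor index sits in bits 15-3, which is why the index is multiplied by 8. The low three bits hold a table indicator and a requested privilege level, both 0 in our case. A minimal C sketch, with make_selector as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Build a segment selector:
 *   bits 15-3: descriptor index
 *   bit  2:    table indicator (0 = GDT, 1 = LDT)
 *   bits 1-0:  requested privilege level */
uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192 && use_ldt <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (use_ldt << 2) | rpl);
}
```

So make_selector(1, 0, 0) gives 0x08 for the code segment, and make_selector(4, 0, 0) gives 0x20 for the video segment.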
Install 32-bit mode IDT
We cannot reenable interrupts until we have a set of 32-bit-compatible interrupt handlers, and a table pointing to them. We have to actually write the interrupt handlers (although many of these can just be do-nothing routines), and then fill in a table to point to them.
The interrupt descriptors have the form:
63-48 | 47-40 | 39-32 | 31-16 | 15-0 |
---|---|---|---|---|
High offset | Type/Attr | Zero | Segment | Low offset |
The location of each handler is specified as a segment selector (i.e., offset into the GDT) and then an offset within the segment. The offset is 32 bits, but split up within the structure. All of our handlers will be within our code segment, so Segment will be 0x8, and then the addresses will just be relative to the start of that segment.
The Type/Attribute byte looks like
7 | 6-5 | 4 | 3-0 |
---|---|---|---|
Present | Priv | S | Type |
Present should be set to 1 for used interrupt descriptors.
Priv. gives the privilege level of the gate: the least-privileged level from which code may invoke this interrupt with the int instruction. (Hardware interrupts are not subject to this check, and the handler itself runs at the privilege level of its code segment.)
S should be set to 0 for interrupt and trap handlers.
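Type will be one of 0b1110 (32-bit interrupt gate) or 0b1111 (32-bit trap gate). The two differ in that an interrupt gate clears the IF flag on entry, so no further maskable interrupts arrive while the handler runs (usually what we want for hardware interrupts), whereas a trap gate leaves IF unchanged (common for software interrupts and exceptions). The other gate types are for task-switch interrupts, or for 16-bit interrupts (the IDT can have both 32- and 16-bit handlers in it). Remember that software interrupts are routines that client code will call, whereas hardware interrupts are generated by the system itself.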
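As with segment descriptors, the split offset is easiest to see in code. The following C sketch builds the Type/Attr byte and packs a complete 64-bit interrupt descriptor according to the two tables above. The names gate_type_attr and pack_gate are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Type/Attr byte: Present in bit 7, Priv in bits 6-5, S (bit 4) left at 0,
 * gate type in bits 3-0. */
uint8_t gate_type_attr(int present, unsigned priv, unsigned type)
{
    assert(priv <= 3 && type <= 0xF);
    return (uint8_t)((present ? 0x80u : 0u) | (priv << 5) | type);
}

/* Pack an interrupt descriptor; bits 39-32 are the always-zero byte. */
uint64_t pack_gate(uint32_t offset, uint16_t selector, uint8_t type_attr)
{
    uint64_t g = 0;
    g |= (uint64_t)(offset & 0xFFFF);     /* Low offset  -> bits 15-0  */
    g |= (uint64_t)selector << 16;        /* Segment     -> bits 31-16 */
    g |= (uint64_t)type_attr << 40;       /* Type/Attr   -> bits 47-40 */
    g |= (uint64_t)(offset >> 16) << 48;  /* High offset -> bits 63-48 */
    return g;
}
```

For instance, a present, privilege-0 descriptor of type 0b1110 has Type/Attr byte 0x8E. With a handler at offset 0x12345678 in segment 0x8, the packed entry is 0x12348E0000085678.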
Like lgdt, the lidt instruction expects the address of an IDTR structure in memory, which should have the structure
idt:
...
idt_size: equ $ - idt
idtr_struct:
dw idt_size - 1 ; Limit = size in bytes, minus one
dd idt ; Base address
which we then load with
lidt [idtr_struct]
(We name the table idt rather than lidt, since lidt is an instruction mnemonic.)
Interrupt handlers and the PIC
There are 256 possible interrupts that can be fired. Which of these do we need to write handlers for, and how do we write them?
Writing an interrupt handler is fairly easy: just end it with the iret instruction rather than the normal ret instruction.
The part of the system which interprets hardware interrupts is called the Programmable Interrupt Controller (PIC). It essentially filters/remaps interrupts as they are received, before running the actual interrupt handler. To add to the complexity of the system, there are actually two PIC chips in the system; one was not enough, but rather than redesign the chip, they just chained two of them together. One is called the master PIC and the other the slave.
Communication with each PIC is done via a pair of ports, one for commands and one for data:
Port | Address |
---|---|
Master Cmd | 0x20 |
Master Data | 0x21 |
Slave Cmd | 0xA0 |
Slave Data | 0xA1 |
Note that between the two PICs, there are a total of 15 possible hardware interrupts. (Each PIC provides 8, and one line on the master is used for communication between the two PICs.) One of the most basic operations of the PIC is to determine the mapping from hardware interrupt numbers (0-15) to system interrupt numbers (i.e., entries in the IDT). (Hardware interrupt 2 on the master is the line used for inter-PIC communication, so devices that were historically wired to interrupt 2 are routed to interrupt 9 on the slave instead; if you receive interrupt 9, it may originally have been 2.)
The master PIC is responsible for interrupts 0-7, while the slave is responsible for 8-15. Each PIC has a vector offset which is added to the (hardware) interrupt number, counted from 0 within that PIC, to get the (IDT) interrupt index seen by the CPU. This means that the two groups of hardware interrupts (0-7 and 8-15) can be mapped to different parts of the IDT. Both vector offsets must be multiples of 8.
In 16-bit mode, the default mapping is to map hardware interrupts 0-7 to system interrupts 8 - 0xf, and hardware 8-15 to 0x70 - 0x77. In 32-bit mode, the first 32 system interrupts are reserved for CPU exceptions, so at a minimum we have to remap hardware interrupts 0-7 to a different part of the table.
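The mapping from hardware interrupt number to system interrupt number is simple enough to write down as a C sketch. The helper name irq_to_vector is hypothetical, and the offsets 0x20 and 0x28 in the example are just one common choice that clears the 32 reserved entries, not a requirement.

```c
#include <assert.h>
#include <stdint.h>

/* System interrupt number for a hardware interrupt (0-15), given the
 * vector offsets programmed into the master and slave PICs. */
uint8_t irq_to_vector(unsigned irq, uint8_t master_offset, uint8_t slave_offset)
{
    assert(irq < 16);
    assert(master_offset % 8 == 0 && slave_offset % 8 == 0);
    return (uint8_t)(irq < 8 ? master_offset + irq
                             : slave_offset + (irq - 8));
}
```

With the 16-bit defaults (0x08 and 0x70), hardware interrupt 0 arrives as system interrupt 0x08 and hardware interrupt 8 as 0x70. With offsets 0x20 and 0x28, the sixteen hardware interrupts occupy 0x20 through 0x2F.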
In order to remap the PIC, you have to reinitialize it from scratch, essentially “rebooting” that component of the system. Thus, remapping the PIC is a rather complex procedure.
When we start protected mode, we have to reinitialize the PICs by sending the initialize command, 0x11, to each command port. After this command, we send three initialization words to each PIC’s data port, telling the PIC
Its vector offset
How the master/slave connection is setup
Some additional information
See here for the details.
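As a rough sketch of what the reinitialization looks like, the following C code lists the eight port writes in order instead of performing them, so it can be inspected without touching hardware. The values for the second and third words (0x04 and 0x02 for the master/slave connection, 0x01 for 8086 mode) are the ones commonly used for the standard PC wiring. They are an assumption on our part, not something specified in these notes. The names struct port_write and pic_remap_sequence are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct port_write { uint16_t port; uint8_t value; };

enum { PIC1_CMD = 0x20, PIC1_DATA = 0x21, PIC2_CMD = 0xA0, PIC2_DATA = 0xA1 };

/* Fill out[] with the port writes that reinitialize both PICs, in order.
 * Returns the number of writes (always 8). */
size_t pic_remap_sequence(uint8_t master_offset, uint8_t slave_offset,
                          struct port_write out[8])
{
    assert(master_offset % 8 == 0 && slave_offset % 8 == 0);
    size_t n = 0;
    out[n++] = (struct port_write){ PIC1_CMD,  0x11 };          /* initialize */
    out[n++] = (struct port_write){ PIC2_CMD,  0x11 };
    out[n++] = (struct port_write){ PIC1_DATA, master_offset }; /* word 1: vector offset */
    out[n++] = (struct port_write){ PIC2_DATA, slave_offset };
    out[n++] = (struct port_write){ PIC1_DATA, 0x04 };  /* word 2: slave is on line 2 (assumed) */
    out[n++] = (struct port_write){ PIC2_DATA, 0x02 };  /* word 2: slave's identity (assumed) */
    out[n++] = (struct port_write){ PIC1_DATA, 0x01 };  /* word 3: 8086 mode (assumed) */
    out[n++] = (struct port_write){ PIC2_DATA, 0x01 };
    return n;
}
```

A real implementation would replace each table entry with an out instruction to the given port, typically with a short delay between writes on older hardware.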
When an interrupt routine ends, before it executes iret, it needs to signal to the PIC that it is finished. This is done by issuing the command PIC_EOI (end-of-interrupt), 0x20. This command must be sent to the PIC which originated the interrupt; if the interrupt came from the slave, the master must be signaled as well, since the interrupt was passed on through it. So depending on which PIC it came from, we either do
mov al, 0x20 ; PIC_EOI
out 0x20, al ; Signal master
or
mov al, 0x20 ; PIC_EOI
out 0xA0, al ; Signal slave
out 0x20, al ; Signal master
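The choice of which PIC to signal can be expressed as a small C sketch. It assumes the command ports from the table earlier (0x20 for the master, 0xA0 for the slave) and the usual convention that an interrupt arriving through the slave is acknowledged on both chips. The helper name eoi_ports is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill ports[] with the command ports that must receive PIC_EOI (0x20)
 * for the given hardware interrupt (0-15). Returns how many ports. */
size_t eoi_ports(unsigned irq, uint16_t ports[2])
{
    assert(irq < 16);
    size_t n = 0;
    if (irq >= 8)
        ports[n++] = 0xA0;  /* slave command port */
    ports[n++] = 0x20;      /* master command port */
    return n;
}
```

Hardware interrupt 3 therefore needs one write (to 0x20), while hardware interrupt 12 needs two (to 0xA0, then 0x20).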
Masking interrupts
The PIC has the ability to mask (temporarily ignore) certain interrupts. Each PIC has a mask register which is 8 bits wide. Each bit of the mask corresponds to one of the interrupt lines connected to that PIC. If a bit is set, then the PIC will ignore any signals on the corresponding line; if it is unset, the corresponding line functions normally.
Masking is done via the data port: read a byte from the data port to get the current mask, and then write the (modified) value back to the data port to set the mask.
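The read-modify-write step can be sketched in C as pure bit manipulation, leaving the actual port I/O aside. The helper names irq_mask_port, mask_irq, and unmask_irq are hypothetical. The data ports are the ones from the table earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Data port of the PIC that owns the given hardware interrupt (0-15). */
uint16_t irq_mask_port(unsigned irq)
{
    assert(irq < 16);
    return irq < 8 ? 0x21 : 0xA1;
}

/* New mask value with the interrupt's line ignored (bit set). */
uint8_t mask_irq(uint8_t current, unsigned irq)
{
    assert(irq < 16);
    return (uint8_t)(current | (1u << (irq % 8)));
}

/* New mask value with the interrupt's line enabled again (bit cleared). */
uint8_t unmask_irq(uint8_t current, unsigned irq)
{
    assert(irq < 16);
    return (uint8_t)(current & ~(1u << (irq % 8)));
}
```

To mask hardware interrupt 12, for example, we would read the current mask from port 0xA1, pass it through mask_irq (which sets bit 4), and write the result back to 0xA1.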
Multi-stage bootloader
;;;
;;; two-stage.s
;;; Illustrates a two-stage loader, where the first stage invokes the BIOS
;;; to load the second stage.
;;;
bits 16
org 0x7c00
start:
origin: equ 0x7c00
blk_count: equ (end - loaded_code) / 512 + 1
; -----------------------------------------------------------------------------
; First stage loader
; Make sure DS is 0, so that data addresses match org 0x7c00
xor ax, ax
mov ds, ax
; Reset disk
mov ah, 0x0 ; Subfunction reset
mov dl, 0x80 ; Disk number
int 0x13
; Load blocks
mov ah, 0x42 ; Int 0x13, subfunction Extended Read
mov dl, 0x80 ; Drive num
mov si, disk_packet ; Packet address
int 0x13
jmp loaded_code
; ----------------------------------------------------------------------------
; Begin "pseudo-data" section
string: db "Hello, world!"
strlen: equ $-string
screen_addr: equ 0xb8000
align 2
disk_packet: db 0x10 ; Packet size
db 0 ; Reserved
dw blk_count ; Block count
dd loaded_code ; Addr. to load
dq 1 ; Starting block (64-bit, to fill the 0x10-byte packet)
; Pad remainder with 0 bytes
times 510 - ($ - $$) db 0
; Write boot signature at end
dw 0xaa55
; -----------------------------------------------------------------------------
; Begin second-stage loader
loaded_code:
; Set 80x25 text mode
mov ah, 0x0
mov al, 0x3
int 0x10
; Print text
mov si, 0 ; Memory index/cursor position
print:
; Print character
mov ah, 0x0a ; Subfunction = write char
mov al, byte [si + string]
mov bh, 0 ; Page = 0
mov cx, 1 ; Write count = 1
int 0x10
; Move cursor
inc si
mov ah, 0x02 ; Subfunction = set cursor pos.
mov bh, 0 ; Page = 0
mov dx, si ; Cursor col (dl) = si
mov dh, 0 ; Cursor row = 0
int 0x10
cmp si, strlen
jne print
; Infinite loop
forever: jmp forever
end:
; Pad so there's a good number of blocks used in the disk
times 1024 * 1024 db 0
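The disk_packet in the listing is the 16-byte packet that the BIOS Extended Read call (int 0x13, subfunction 0x42) expects. As a cross-check of the layout and of the blk_count arithmetic, here is a C sketch. The struct and function names are hypothetical, and blocks_needed simply mirrors the listing's formula, which always rounds up by a full block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
struct disk_packet {
    uint8_t  size;         /* packet size, 0x10 */
    uint8_t  reserved;     /* always 0 */
    uint16_t block_count;  /* number of 512-byte blocks to read */
    uint16_t buf_offset;   /* where to load: offset... */
    uint16_t buf_segment;  /* ...and segment */
    uint64_t start_block;  /* first block to read (block 0 is the boot sector) */
};
#pragma pack(pop)

/* Same formula as blk_count in the listing: bytes / 512 + 1. */
uint16_t blocks_needed(uint32_t bytes)
{
    return (uint16_t)(bytes / 512 + 1);
}

/* Build a packet that loads `bytes` of second-stage code to 0:offset,
 * starting from the block right after the boot sector. */
struct disk_packet make_disk_packet(uint16_t offset, uint32_t bytes)
{
    struct disk_packet p = { 0x10, 0, blocks_needed(bytes), offset, 0, 1 };
    return p;
}
```

Since the first stage occupies exactly one 512-byte block loaded at 0x7c00, the second stage lands at offset 0x7e00, and the packet's starting block is 1.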