2015-06-14

2015-06-14 Virtual Memory

Up until now, all of my kernels have used a flat, physical memory model.  This has been useful because it simplified the development of many components, not least the device drivers, which often need to provide physical addresses to devices or to map physical buffers into their address space before they can access them.  I can have multiple tasks running at the same time in this model by using what other systems would call multi-threading.  (I have some test kernels from some time ago where I added multi-threading support and could run multiple tasks at once, but these were limited to writing a character to the screen then waiting for some time.)

I now want to break that boundary and make my kernel more mature by introducing full multi-processing abilities with multi-threading and potentially "Thread-local Storage" (an area of data and/or bss in the executable that is copied per thread so that each thread can have global variables that are separate from any other thread).  To introduce multi-processing, I really need to get virtual memory working.

I posted an article last month about broadly how I was planning to implement this; much of that was so I could get the ideas straight in my mind before I tried to do it.

I have now implemented the first part of that plan.  The kernel is now linked to address 0xC0100000 (3GB+1MB) and gets loaded by the multiboot loader to 1MB physical.  This is all achieved using the linker script (linker.ld) but with a few modifications:

SECTIONS
{
 . = 0xC0100000;

 .text ALIGN(4K) : AT(ADDR(.text) - 0xC0000000)
 {
  *(.text.multiboot)
  *(.text)
 }
 
 ...

The ". = 0xC0100000" sets the link address to where I wanted it (3GB+1MB), so when a function call or other JuMP in my code jumps to an absolute memory address, it lands somewhere in the range 0xC0100000 to 0xC0164000 (the approximate current start and end of my kernel).  If I only did this, the multiboot loader would have tried to load the kernel to that location in physical memory, which would have been bad: there could be device memory, BIOS structures, or even nothing at all at that PHYSICAL location (especially if you have less than 3GB of RAM in your machine).  That's where the next modification to the linker script comes in: the AT() directive tells the linker to create an executable which loads the section to the given PHYSICAL address (in this case, we subtract 3GB from the address, so 3GB+1MB becomes 1MB).

With these two changes, the multiboot loader will still load the kernel to 1MB PHYSICAL, and all the JuMPs in the code will point to somewhere above the 3GB+1MB mark.
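
The address arithmetic behind AT() can be sketched in C.  This is only an illustration of the subtraction the linker script performs, not code from the kernel; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Offset between link (virtual) and load (physical) addresses,
 * the same 0xC0000000 that the linker script subtracts in AT(). */
#define KERNEL_VIRTUAL_BASE 0xC0000000u

/* Hypothetical helper: physical load address of a higher-half kernel address. */
static inline uint32_t kernel_virt_to_phys(uint32_t virt)
{
    assert(virt >= KERNEL_VIRTUAL_BASE);   /* only valid for kernel addresses */
    return virt - KERNEL_VIRTUAL_BASE;
}

/* Hypothetical helper: the reverse direction, for physical addresses
 * that fit below the 1GB window above KERNEL_VIRTUAL_BASE. */
static inline uint32_t kernel_phys_to_virt(uint32_t phys)
{
    assert(phys < 0x40000000u);            /* otherwise the sum would wrap */
    return phys + KERNEL_VIRTUAL_BASE;
}
```

With these, kernel_virt_to_phys(0xC0100000) gives 0x00100000, which is the 3GB+1MB to 1MB step described above.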

The linker script is also responsible for telling the program loader where to start executing the program via the ENTRY() directive.  Because this address will be called before paging is enabled, it needs to be changed to be the physical address of my entry point (the first piece of my OS code which will run when the system is booted).  In my boot.s assembly file (which contains the entry point), I have this code:

.global _start
.global _start_p

.set _start_p, _start - 0xC0000000

.text
_start:

I define two symbols here: _start is the symbol for the actual entry point (which will be at 0xC0100000 somewhere), and _start_p is the symbol for the physical address of the entry point (at 0x100000 somewhere).  The ENTRY() directive in the linker script now references _start_p.

The last place that needed changes was the actual _start function itself.  It needs to set up paging, and I also put the page directory in here.  Here it is in full:

.text
_start:
 # Load the physical location of the page directory.  This has to map the kernel to the 1MB mark, and to the 3GB+1MB mark at the same time.
 mov $boot_page_directory - KERNEL_ADDRESS_V, %ecx
 mov %ecx, %cr3

 mov %cr0, %ecx   # Set the paging bit in CR0
 or $0x80000000, %ecx   # The $ makes this an immediate (bit 31, PG), not a memory operand
 mov %ecx, %cr0

 movl $kernel_start, %ecx
 jmp *%ecx   # This makes an absolute jump to the virtual 3GB+1MB region


.section .data
.align 0x1000
boot_page_directory:
 # This entry is 0MB to 4MB (0x0 to 0x400000 of 0x100000000)
 .long ( boot_page_table - KERNEL_ADDRESS_V ) + 7
 .rept 767
 .long 0
 .endr
 # This entry is 3GB to 3GB+4MB (0xC0000000 to 0xC0400000 of 0x100000000)
 .long ( boot_page_table - KERNEL_ADDRESS_V ) + 7
 .rept 254
 .long 0
 .endr
 # This is the last 4MB, which references the page directory itself
 .long ( boot_page_directory - KERNEL_ADDRESS_V ) + 7

.align 0x1000
boot_page_table:
 # Each entry in here represents 4KB of this 4MB.
 # This page table is used for both the 0MB and the 3GB page directory entries
 .set page_table_count, 7  # Set the initial flags value
 .rept 1024
 .long page_table_count   # Set this page table entry
 .set page_table_count, page_table_count + 0x1000  # Then increment the value for the next time around.
 .endr


As you can see, the code is minimal.  It loads a special CPU register (CR3) with the address of the Page Directory, then it sets the PAGING bit of another special CPU register (CR0), and finally jumps to the virtual address of the kernel's first C function.
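
To see why the page directory entries sit where they do, it helps to split a virtual address the way the CPU does with 32-bit paging and 4KB pages: the top 10 bits select the page directory entry, the next 10 bits select the page table entry, and the low 12 bits are the offset within the page.  A minimal sketch in C (the function names are made up for illustration):

```c
#include <stdint.h>

/* Bit 31 of CR0 enables paging; this is the 0x80000000 used in _start. */
#define CR0_PG (1u << 31)

/* Top 10 bits: index into the page directory (each entry covers 4MB). */
static inline uint32_t pd_index(uint32_t virt)
{
    return virt >> 22;
}

/* Middle 10 bits: index into the page table (each entry covers 4KB). */
static inline uint32_t pt_index(uint32_t virt)
{
    return (virt >> 12) & 0x3FFu;
}

/* Low 12 bits: byte offset inside the 4KB page. */
static inline uint32_t page_offset(uint32_t virt)
{
    return virt & 0xFFFu;
}
```

For 0xC0100000 this gives directory index 768 and table index 256, which is why the second mapped directory entry comes after one entry plus 767 empty ones.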

I've specified that both the page directory and page table should be in the "data" section of the program using the ".section .data" directive, and also that they should be aligned to a 4KB boundary using the ".align 0x1000" directive.  I then build the two tables using the ".rept" directive to repeat values as many times as I need, in the directory to repeat the 0 entries, and also in the page table to create an identity mapped page table.
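
The layout of the two tables can also be modelled in ordinary C on a development machine, which is a handy way to check the entry counts before booting.  The sketch below is a host-side model only, not kernel code: the two "physical" addresses for the tables are made-up values, and the function and type names are hypothetical.

```c
#include <stdint.h>

#define MODEL_ENTRIES   1024u
#define MODEL_PAGE_SIZE 0x1000u
#define MODEL_FLAGS     0x7u         /* present | writable | user: the "+ 7" */
#define MODEL_ADDR_MASK 0xFFFFF000u
#define MODEL_PD_PHYS   0x00104000u  /* assumed location of the directory */
#define MODEL_PT_PHYS   0x00105000u  /* assumed location of the page table */

typedef struct {
    uint32_t directory[MODEL_ENTRIES];
    uint32_t table[MODEL_ENTRIES];
} boot_paging_model;

/* Mirrors the .rept loops: one identity-mapped table, referenced twice,
 * plus the self-referencing last directory entry. */
static void build_boot_tables(boot_paging_model *m)
{
    for (uint32_t i = 0; i < MODEL_ENTRIES; i++) {
        m->directory[i] = 0;
        m->table[i] = i * MODEL_PAGE_SIZE + MODEL_FLAGS;
    }
    m->directory[0]    = MODEL_PT_PHYS + MODEL_FLAGS;  /* 0MB to 4MB */
    m->directory[768]  = MODEL_PT_PHYS + MODEL_FLAGS;  /* 3GB to 3GB+4MB */
    m->directory[1023] = MODEL_PD_PHYS + MODEL_FLAGS;  /* directory itself */
}

/* Two-level walk.  Returns 1 and stores the physical address on success,
 * 0 if an entry on the way is not present. */
static int translate(const boot_paging_model *m, uint32_t virt, uint32_t *phys)
{
    uint32_t pde = m->directory[virt >> 22];
    const uint32_t *table;

    if (!(pde & 1u))
        return 0;
    if ((pde & MODEL_ADDR_MASK) == MODEL_PT_PHYS)
        table = m->table;
    else if ((pde & MODEL_ADDR_MASK) == MODEL_PD_PHYS)
        table = m->directory;  /* self-reference: directory acts as a table */
    else
        return 0;

    uint32_t pte = table[(virt >> 12) & 0x3FFu];
    if (!(pte & 1u))
        return 0;
    *phys = (pte & MODEL_ADDR_MASK) | (virt & 0xFFFu);
    return 1;
}
```

In this model both 0x00100000 and 0xC0100000 translate to physical 0x00100000, anything between 4MB and 3GB is unmapped, and the self-referencing entry makes the page directory itself visible at 0xFFFFF000.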

That's it, all done, paging is set up and everything works beautifully ...except it doesn't.  As ever, I encountered problems getting it to actually work.

The first was an easy problem: the memory manager added a big chunk of physical memory to the free list ready to be used by the malloc() routine, but I couldn't access that memory any more because it was no longer mapped.  I modified the memory manager initialisation code so that it added the chunk of memory from the end of the kernel to the end of the mapped 4MB, giving the kernel approximately 2.5MB of memory available for use.
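
The size of that region is simple to work out.  The sketch below is an illustration rather than the memory manager's real code: the struct, the function name and the idea of passing in the kernel's end address (for example from a linker script symbol) are all assumptions.

```c
#include <stdint.h>

#define KERNEL_VIRTUAL_BASE 0xC0000000u
#define BOOT_MAPPED_BYTES   0x00400000u  /* one page table covers 4MB */

typedef struct {
    uint32_t start;   /* first usable virtual address */
    uint32_t length;  /* number of usable bytes */
} free_region;

/* Hypothetical helper: the part of the boot-mapped 4MB window that lies
 * after the kernel image, starting at the next page boundary. */
static free_region boot_free_region(uint32_t kernel_end_virt)
{
    uint32_t start = (kernel_end_virt + 0xFFFu) & ~0xFFFu;  /* round up to 4KB */
    uint32_t limit = KERNEL_VIRTUAL_BASE + BOOT_MAPPED_BYTES;
    free_region r;

    r.start = start;
    r.length = (start < limit) ? (limit - start) : 0;
    return r;
}
```

With the approximate kernel end of 0xC0164000 quoted earlier, this yields a region of 0x29C000 bytes starting at 0xC0164000, which is in line with the rough figure above.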

The second problem was an odd one that I don't quite understand yet.  I use the LGDT and LIDT instructions to load the Global Descriptor Table (GDT) and Interrupt Descriptor Table (IDT) as you do, but both of these instructions stopped working when I enabled paging.  For some reason, when I passed the LGDT instruction the memory address of the GDT Descriptor directly, it didn't work with paging on (even though it worked with paging off, unless I'm going mad).  After a while of going through the disassembly and trying various different things, the only way I found to get it to work was to load the GDT address into a register, then pass the register to LGDT as a memory reference, like so:

uint32* lPhysicalAddress = gdt_gdtDesc;
asm("lgdt (%0)" : : "r"(lPhysicalAddress));

The same was true of the LIDT instruction, fixed in the same way.
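
For reference, the operand that LGDT and LIDT read in 32-bit mode is a 6-byte structure: a 16-bit limit followed by a 32-bit base, where the base is a linear address (so, with paging on, a virtual one).  The sketch below shows that layout in C.  It is not the kernel's gdt_gdtDesc definition, which isn't shown here; the type and function names are hypothetical, and the packed attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Layout of the memory operand for LGDT/LIDT in 32-bit mode.
 * Packed so the compiler does not pad between limit and base. */
typedef struct __attribute__((packed)) {
    uint16_t limit;  /* size of the table in bytes, minus one */
    uint32_t base;   /* linear address of the first table entry */
} descriptor_table_pointer;

/* Hypothetical helper: build the operand for a table of 8-byte entries. */
static descriptor_table_pointer make_table_pointer(uint32_t base,
                                                   uint16_t entry_count)
{
    descriptor_table_pointer p;

    assert(entry_count > 0 && entry_count <= 8192);
    p.limit = (uint16_t)(entry_count * 8u - 1u);
    p.base = base;
    return p;
}
```

One thing worth checking when an LGDT stops working is whether this structure is really 6 bytes with no padding, and whether both its own address and the base it contains are valid under the current page mapping.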

The last problem was another odd one.  Many months ago when I first got Doom running, I had a problem on one of my test machines where Doom failed to load with an issue that I traced back to the FPU, and I added an instruction to the kernel start-up to reset the FPU with "asm("fninit");".  With paging enabled, this instruction failed with an Interrupt 7, but commenting the line out made it work again :)  I suspect that the FPU makes use of some area of memory for caching or stack or some such which doesn't work with my current paging set-up.  It is something to investigate another day.
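
One thing to check when that investigation happens: Interrupt 7 is the Device Not Available exception, which the CPU raises for x87 instructions such as FNINIT when the EM or TS bit in CR0 is set.  The sketch below is not a diagnosis of this particular fault, just a small check that could be run against whatever value ends up in CR0; the function name is made up.

```c
#include <stdint.h>

#define CR0_EM (1u << 2)  /* Emulation: no FPU present, trap x87 instructions */
#define CR0_TS (1u << 3)  /* Task Switched: trap the next x87 instruction */

/* Returns 1 if an x87 instruction would raise Device Not Available
 * (Interrupt 7) with this CR0 value, 0 otherwise. */
static int x87_raises_device_not_available(uint32_t cr0)
{
    return (cr0 & (CR0_EM | CR0_TS)) != 0;
}
```

If the value written to CR0 while enabling paging has either bit set, that alone would explain the exception without the page tables being involved.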


With all of these obstacles overcome (or at least worked around), the kernel is back to booting to a console, but with everything running in virtual memory.  With the console restored, I can work toward getting the physical page allocator and page fault handler in place gradually, testing each part as I write it.  I prefer this approach to having a non-functional kernel until everything works correctly.

I have a lot of work still to do, as everything I've written so far will need tweaking where it accesses physical memory.  The only reason the VGA driver is working at the moment is because of the identity map of the low 4MB (virtual equal to physical) that I added earlier, so even that needs changes.

Lots to do ...