WeensyOS Minilab 3 (Memory OS): Due 4/13

The third WeensyOS lab demonstrates how to use the paging hardware to virtualize memory among processes.

Setting Up

Download and unpack the source for weensyos3 using the following commands:

% tar xvzf weensyos3.tgz 
% ls weensyos3
COPYRIGHT    lib.c            memos-alloc3.ld  memos-kern.c    mkbootdisk.c
GNUmakefile  lib.h            memos-alloc4.c   memos-kern.h    types.h
answers.txt  memos-alloc1.c   memos-alloc4.ld  memos-loader.c  x86.h
bootstart.S  memos-alloc1.ld  memos-app.h      memos-pages.c   x86mem.h
conf         memos-alloc2.c   memos-boot.c     memos-x86.c     x86struct.h
console.c    memos-alloc2.ld  memos-forker.c   memos.h         x86sync.h
elf.h        memos-alloc3.c   memos-int.S      mergedep.pl

Change into the weensyos3 directory and run gmake run-memos.

This will build and run the single operating system you'll use in WeensyOS 3, the "memory OS" or MemOS. As before, this will start up Bochs, but not the emulated computer. To start the emulated computer, type "c" at the <bochs:1> prompt.

Physical Memory

Your Bochs window asks whether you want to run memos-alloc* or memos-forker.

Type 1 to run memos-alloc*. You should see something like this:

[Initial MemOS Alloc Physical Memory Map]

(This image loops forever, but when you run MemOS, the bars will move to the right and stay there.)

If Bochs runs too slowly (the bars of 1's through 4's move slowly), edit memos.h and reduce the ALLOC_SPEED constant.

Here's what's going on.

MemOS displays the current state of physical memory. Each character represents 4 KB of memory: a single physical page. There is 2 MB of physical memory in total.
MemOS runs four processes, 1 through 4.
Since MemOS initially uses physical memory allocation, each process must use a separate region of memory. We preallocate 1/4 MB (64 pages) for each process.
Each process asks the kernel for more heap space, one page at a time, until it runs out of room. As usual, each process's heap begins just above its code and global data, and ends just below its stack. The processes allocate space at different rates: compared to Process 1, Process 2 allocates space twice as fast, Process 3 goes three times as fast, and Process 4 goes four times as fast. (A random number generator is used, so the exact rates may vary.)
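The page counts behind this display follow directly from the sizes above. The sketch below just does the arithmetic; the macro and function names are made up for illustration and are not taken from the MemOS source.

```c
#include <stdint.h>

#define PAGE_SIZE      4096u                /* 4 KB per physical page (from the text) */
#define PHYS_MEM_SIZE  (2u * 1024 * 1024)   /* 2 MB of physical memory (from the text) */
#define PROC_MEM_SIZE  (PHYS_MEM_SIZE / 8)  /* 1/4 MB preallocated per process */

/* Number of whole pages that fit in `bytes` bytes. */
static uint32_t pages_in(uint32_t bytes) {
    return bytes / PAGE_SIZE;
}

/* Number of the physical page (one display character) containing address `pa`. */
static uint32_t page_number(uint32_t pa) {
    return pa / PAGE_SIZE;
}
```

So the whole display covers 512 pages, each process's preallocated region covers 64 of them, and physical address 0x100000 falls in page number 256.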

The marching rows of 1's, 2's, 3's, and 4's show how fast the heap spaces for processes 1, 2, 3, and 4 are allocated. Here are two labeled memory diagrams, showing what the characters mean and how memory is arranged.

Read and understand the process code in memos-alloc1.c. This code is used for all 4 processes.

Virtual Memory: One Address Space

In the rest of this lab, you will gradually switch the MemOS to use virtual memory! This requires that we set up different address spaces, one for each process, and change the page allocation function, page_alloc, to allocate a physical page and map it at the required address, rather than simply allocating the physical page with the right address. First, we'll simply set up a single virtual address space that matches physical memory.

Exercise 1: Implement the page_alloc_free function in memos-pages.c. This function should use the pageinfo array to find a page that is currently free.

Then edit the start function in memos-kern.c to initialize virtual memory. After the call to pageinfo_init(), add the following line:

        paged_virtual_memory_init();

This creates an initial page directory for the kernel and all processes, then installs that page directory and turns on paged virtual memory.
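For the first half of Exercise 1, the core idea is a linear scan over the per-page bookkeeping array. The sketch below is a standalone illustration, not the MemOS code: the pageinfo_t layout and the function name find_free_page are hypothetical, and it assumes that a pi_refcount of 0 marks a free page. Check memos-pages.c for the real definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-page record. */
typedef struct {
    int pi_refcount;   /* assumption: 0 means the page is free */
} pageinfo_t;

/* Return the number of the first free page in `pages`,
   or -1 if every page is in use. */
static int find_free_page(const pageinfo_t *pages, size_t npages) {
    for (size_t i = 0; i < npages; i++) {
        if (pages[i].pi_refcount == 0) {
            return (int) i;
        }
    }
    return -1;
}
```

A first-fit scan like this is the simplest policy; with only 512 pages, its cost is negligible.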

If you run gmake run-memos at this point, it should work and produce output similar to what you saw before. But let's take a look at each process's virtual address space as well.

Exercise 2: In the interrupt function in memos-kern.c, after the call to memshow_physical(), add the following line:

        memshow_virtual_flipper();

When you run gmake run-memos now (and enter c at the <bochs:1> prompt), you should see something like this.

There are several changes from the initial display.

As the MemOS runs, the virtual memory display cycles between the four processes' address spaces. However, all the address spaces are the same for now. (If Bochs cycles through the address spaces too slowly, reduce the MEMORY_FLIPPER_SPEED constant in memos.h.)
The new "P" character indicates a memory page used for a page directory or page table: that is, for a structure used to represent an address space. Intel 386-compatible processors use a hardware-defined two-level page table format, so each address space requires at least two pages of memory.
A character in the virtual memory display is reverse-video if a user-mode process is allowed to access the corresponding address. Thus, we can see that the processes can't modify the kernel's code, data, or stack, or the I/O memory (except for the single page that contains the console). However, each process can modify any other process's code, data, or heap, including parts of the heap that haven't been allocated yet.
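To see why an address space needs at least two pages, it helps to look at how the two-level format splits a 32-bit virtual address: 10 bits select a page directory entry, 10 bits select a page table entry, and 12 bits are the offset within the 4 KB page. The helper names below are invented for this sketch.

```c
#include <stdint.h>

/* x86 32-bit paging with 4 KB pages splits a virtual address 10/10/12. */

/* Index into the page directory: the top 10 bits. */
static uint32_t pd_index(uint32_t va) {
    return va >> 22;
}

/* Index into the selected page table: the middle 10 bits. */
static uint32_t pt_index(uint32_t va) {
    return (va >> 12) & 0x3FF;
}

/* Byte offset within the page: the low 12 bits. */
static uint32_t page_offset(uint32_t va) {
    return va & 0xFFF;
}
```

Each page table has 1024 entries and so covers 4 MB of virtual addresses. Every address below 0x400000 has directory index 0, which means one page directory plus one page table is enough to describe a small address space.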

Virtual Memory: Isolated Address Spaces

MemOS is now using virtual memory, but we're getting no benefit from it! Next, we'll update the kernel code to grant each process its own address space. This will isolate the processes from one another; no process will be able to alter another process's memory or data.

Several changes are required.

MemOS must create a new address space (page directory) for each process. The kernel portion of all processes' address spaces will be identical: virtual addresses [0, PROC1_START_ADDR) will map to physical addresses [0, PROC1_START_ADDR), with kernel-only permission (except that the console page allows user access). Thus, the kernel may freely switch between address spaces without causing page faults on kernel code or data.
The kernel must switch to a process's address space before loading that process's code and data into memory with program_loader(), and before running that process.
Finally, the paged_virtual_memory_init() and page_alloc() functions must be changed so that user processes start out with unmapped memory, and memory is mapped gradually as pages are allocated.

Switching to an address space uses the lcr3() function, a simple wrapper for the lcr3 instruction. The 386+'s cr3 register holds the physical address of the active page directory; kernel code may change the value of this register, and thus switch to a new address space, by calling lcr3() with the relevant page directory address.

Exercise 3: Change the process initialization code in start_process() to give each process its own independent address space. Use lcr3() to switch to each process's address space before calling program_loader(). Use the pgdir_copy() function to create an independent address space based on the kernel_pgdir.

Then, change the run() function to load a process's address space before running that process.

Exercise 4: Change the page_alloc() function to add mappings for user pages as they are allocated. You will use pgdir_set(), specifying a PTE value that includes flags for present, writable, user-accessible pages. Base your code on the relevant code in paged_virtual_memory_init().

Then, edit paged_virtual_memory_init() so that user pages are initially unmapped. Now a user page is mapped only if it was allocated by page_alloc().
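The PTE value in Exercise 4 is a page-aligned physical address with flag bits in the low 12 bits. The bit positions below are fixed by the x86 page table entry format, but the macro names PTE_P, PTE_W, and PTE_U and both helper functions are assumptions made for this sketch; use whatever names the lab's headers define.

```c
#include <stdint.h>

/* Flag bits in an x86 page table entry (macro names are illustrative). */
#define PTE_P 0x001u   /* present */
#define PTE_W 0x002u   /* writable */
#define PTE_U 0x004u   /* user-accessible */

/* Build a PTE mapping physical address `pa` as a present,
   writable, user-accessible page. Low 12 bits of `pa` are dropped. */
static uint32_t make_user_pte(uint32_t pa) {
    return (pa & ~0xFFFu) | PTE_P | PTE_W | PTE_U;
}

/* Nonzero if user-mode code may access a page mapped by `pte`. */
static int pte_user_ok(uint32_t pte) {
    return (pte & (PTE_P | PTE_U)) == (PTE_P | PTE_U);
}
```

A kernel-only mapping is the same value without PTE_U, and an unmapped page is simply an entry without PTE_P.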

When you run gmake run-memos, you should see something like this. (Note the greater number of "P" pages.)

Now each process's address space only contains that process's pages. Furthermore, since processes run in user mode (at protection level 3), the processor will not allow them to execute the lcr3 and lcr0 instructions that would install a new address space or turn off virtual memory. This means the processes are memory isolated: no process can affect another process's code or data.

Question 5: Memory isolation also depends on properties of the processes' page tables. Which types of page must be marked as inaccessible or kernel-only to guarantee memory isolation? List only those page types that might allow processes to observably affect other processes' behavior via reading or writing memory. (The console doesn't count.) Identify page types by characters used in the memory map display.

Virtual Memory: Address Mapping

MemOS now maintains an isolated address space for each process. However, it still allocates memory based on physical address. Given virtual memory, we can allocate memory far more flexibly. In this section, you will change MemOS to allocate memory independently of physical address, and to give each process a much larger heap space, allowing any process to potentially allocate most available physical memory.

Exercise 6: Edit the page_alloc function so that it can use any free page, rather than just the physical page with the given address. If no free page is available, page_alloc should return -1. Your kernel should continue to work even after physical memory is exhausted.

Then, initialize each process's stack to start at virtual address 0x300000, the top of MemOS's virtual address space.
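The change to page_alloc in Exercise 6 decouples the virtual page being requested from the physical page that backs it. The toy simulation below shows the shape of that logic with a flat array standing in for the page table. All names and sizes are invented, and returning 0 on success is an assumption (the text only specifies -1 on failure).

```c
#define SIM_NPAGES  8    /* tiny simulated physical memory */
#define SIM_NVPAGES 16   /* tiny simulated virtual address space */

static int sim_refcount[SIM_NPAGES];  /* 0 = free physical page */
static int sim_map[SIM_NVPAGES];      /* virtual page -> physical page + 1; 0 = unmapped */

/* Back virtual page `vpn` with any free physical page.
   Returns 0 on success, or -1 if `vpn` is out of range, already
   mapped, or no physical page is free. */
static int sim_page_alloc(int vpn) {
    if (vpn < 0 || vpn >= SIM_NVPAGES || sim_map[vpn] != 0) {
        return -1;
    }
    for (int ppn = 0; ppn < SIM_NPAGES; ppn++) {
        if (sim_refcount[ppn] == 0) {
            sim_refcount[ppn] = 1;
            sim_map[vpn] = ppn + 1;
            return 0;
        }
    }
    return -1;   /* physical memory exhausted; the caller must cope */
}
```

Note that failure leaves both arrays untouched, so a process that is refused memory can keep running with what it already has.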

Now when you run gmake run-memos, you should see something like this.

[MemOS Virtual Address Spaces with Address Mapping]

You have now built a virtual memory system that, in its essentials, is a lot like the virtual memory system in any modern operating system.

Notice that once physical memory is exhausted, Process 4 has used roughly four times as much memory as Process 1. This is because each process's address space is big enough to fit any available physical memory, so physical memory runs out before any process's address space does. Now that we have eliminated the requirement that processes fit in contiguous regions of physical memory, each process can allocate more memory than before!

Question 7: Process 1's code and global data used to be allocated in the physical pages with addresses 0x100000 and 0x101000. In your implementation, which physical page addresses are now used for Process 1's code and global data (i.e., not including its heap or stack)?

Extra Credit Exercise: It's not necessarily fair that Process 4 gets to use four times as much memory as Process 1 -- especially since Process 1 can get less memory than in the original physically-allocated design. Implement a quota system so that each process is guaranteed that it will be able to allocate at least 1/4 MB (64 pages) of heap space (not including any code, data, and stack space). Any space beyond that 1/4 MB should be allocated first-come, first-served. You will need to keep track of how much physical memory remains available, and how much physical memory each process has allocated. Add some text to answers.txt to describe your implementation.
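One possible way to think about the quota is as a reservation: a process may take a page only if enough free pages would remain to honor the unused guarantees of every other process. The sketch below shows that check in isolation. It is one design among several, and the names, the four-entry array, and the idea of counting only heap pages in `used` are all assumptions.

```c
#define NPROCS      4
#define QUOTA_PAGES 64   /* 1/4 MB of 4 KB pages, from the text */

/* Pages that must stay available for processes other than `pid`
   that have not yet used their guaranteed quota. */
static int reserved_for_others(const int used[NPROCS], int pid) {
    int reserved = 0;
    for (int q = 0; q < NPROCS; q++) {
        if (q != pid && used[q] < QUOTA_PAGES) {
            reserved += QUOTA_PAGES - used[q];
        }
    }
    return reserved;
}

/* Nonzero if process `pid` may take one more heap page
   when `free_pages` physical pages remain. */
static int quota_allows(const int used[NPROCS], int pid, int free_pages) {
    return free_pages > reserved_for_others(used, pid);
}
```

A process below its own quota is never blocked by this rule as long as the kernel has kept at least the sum of all outstanding guarantees free, and anything beyond the guarantees is handed out first-come, first-served.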

Virtual Memory: Forking

Boot MemOS again, but this time type 2 to run the memos-forker application. MemOS will "panic" (it will stop running immediately). This is because you haven't implemented the sys_fork() system call. The memos-forker.c application works a lot like memos-alloc*, but rather than four separate binaries, memos-forker is a single binary that forks three times to create four processes. It's your job now to make this work.

Exercise 8: Implement a handler for INT_SYS_FORK in the interrupt() function. Your handler should find an unused process descriptor, copy the current process's state into the new process descriptor (using pgdir_copy() for the page directory), and set up the reg_eax return values appropriately.
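The bookkeeping half of Exercise 8 can be sketched without any paging at all. The simulation below assumes the conventional fork contract (the child sees 0, the parent sees the child's process ID, and the parent sees -1 if no descriptor is free). The proc_t layout, the state constants, and sim_fork are hypothetical stand-ins, not the MemOS definitions.

```c
#define SIM_NPROCS 5      /* hypothetical table size; slot 0 is never used */
#define P_EMPTY    0
#define P_RUNNABLE 1

/* Hypothetical process descriptor. */
typedef struct {
    int p_state;
    int reg_eax;   /* stand-in for the saved %eax register */
} proc_t;

/* Simulated fork: copy `parent` into the first empty slot and set
   both return values. Returns the child's pid, or -1 on failure. */
static int sim_fork(proc_t procs[SIM_NPROCS], int parent) {
    for (int pid = 1; pid < SIM_NPROCS; pid++) {
        if (procs[pid].p_state == P_EMPTY) {
            procs[pid] = procs[parent];    /* copy the parent's state */
            procs[pid].reg_eax = 0;        /* child's return value */
            procs[parent].reg_eax = pid;   /* parent's return value */
            return pid;
        }
    }
    procs[parent].reg_eax = -1;            /* no free descriptor */
    return -1;
}
```

The real handler also has to give the child a page directory of its own via pgdir_copy(), which this sketch leaves out.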

If you run memos-forker now, though, you'll see something odd!

[MemOS Forker Broken Virtual Address Spaces]

Although each heap page is only accessible to one process, the processes are still not correctly isolated!

Exercise 9: Fix this problem so that sys_fork() correctly creates isolated processes.

Hint 1: Look at pgdir_copy() in memos-pages.c.

Hint 2: Remember that process address spaces should be identical to kernel_pgdir for virtual addresses less than PROC1_START_ADDR.

Extra Credit Exercise: Implement copy-on-write fork. (Our copy-on-write fork solution requires just 21 lines of code!) Pages shared between processes will have pi_refcount > 1. The virtual memory mapping shows such pages using darker colors, so you'll be able to tell that copy-on-write fork is working if memos-forker's code page (which is never written) shows up as a dark red and the processes' data and heap pages are bright colors.
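The reference-counting rule at the heart of copy-on-write can be tried out in a toy model before touching the kernel. In the sketch below, a write to a page with a count above 1 gets a private copy, while a write to a page with a count of exactly 1 proceeds in place. The arrays, sizes, and function name are invented for illustration, and the sketch ignores page table updates and fault handling entirely.

```c
#include <string.h>

#define COW_PAGE_SIZE 16   /* tiny pages keep the simulation small */
#define COW_NPAGES    4

static unsigned char cow_mem[COW_NPAGES][COW_PAGE_SIZE];
static int cow_refcount[COW_NPAGES];   /* plays the role of pi_refcount */

/* Resolve a write to physical page `ppn`, which the writer currently
   maps (so its count is at least 1). Returns the page the writer
   should use afterwards, or -1 if a copy was needed but no page was free. */
static int cow_write_fault(int ppn) {
    if (cow_refcount[ppn] == 1) {
        return ppn;                    /* sole owner: write in place */
    }
    for (int n = 0; n < COW_NPAGES; n++) {
        if (cow_refcount[n] == 0) {
            memcpy(cow_mem[n], cow_mem[ppn], COW_PAGE_SIZE);
            cow_refcount[n] = 1;       /* private copy for the writer */
            cow_refcount[ppn]--;       /* writer drops its share */
            return n;
        }
    }
    return -1;
}
```

In a real kernel the returned page would then be remapped writable in the faulting process's page table; a page that is never written, like memos-forker's code page, keeps its elevated count.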

This completes the minilab.

Handing in

Make sure you have answered Questions 5 and 7 (and any optional extra credit exercises you attempted) in your answers.txt file.

For coding exercises, it's OK for answers.txt to just refer to your code (as long as you comment your code).

Run the command make tarball to create a file named weensyos3-yourusername.tar.gz. This tarfile contains the answers.txt file.

Submit the weensyos3-yourusername.tar.gz file here.