Minilab 1 - Spring 2009

WeensyOS Minilab 1

Introduction

Please first read the lab overview to set up your environment.

The first WeensyOS problem set concerns processes. The MiniprocOS is a tiny operating system that supports the major process primitives: creating a new process with fork, exiting a process, and waiting on a process that has exited. The only thing missing -- and it's a big one -- is process isolation: MiniprocOS processes actually share a single address space. (In a later mini-lab you will implement process isolation for memory.) In this lab, you will actually implement the code that forks a new process, and see how system calls are implemented. You will also update the code that waits on an exited process to avoid busy waiting.

Setting up

You could take one of the disk image files this minilab builds, write it to your laptop's hard drive, and boot up your operating system directly if you wanted! However, it's much easier to work with a virtual machine or PC emulator.

An emulator mimics, or emulates, the behavior of a full hardware platform. A PC emulator acts like a Pentium-class PC: it emulates the execution of Intel x86 instructions, and the behavior of other PC hardware. For example, it can treat a normal file in your home directory as an emulated hard disk; when the program inside the emulator reads a sector from the disk, the emulator simply reads 512 bytes from the file. PC emulators are much slower than real hardware, since they do all of the regular CPU's job in software -- not to mention the disk controller's job, the console's job, and so forth. However, debugging with an emulator is a whole lot friendlier, and you can't screw up your machine!

If you work on a Linux machine provided by us, you are all set to use the pre-installed Bochs emulator. Otherwise, follow these instructions to install a copy on your own computers.

You will also need a copy of GCC that compiles code for an x86 ELF target. Again, the class Linux machines have everything set up for you already.

Download lab1 source, unpack and examine the source for weensyos1 using the following commands:

% tar xzvf weensyos1.tgz
% ls weensyos1
COPYRIGHT    console.c    mkbootdisk.c  mpos-kern.c      mpos.h
GNUmakefile  elf.h        mpos-app.c    mpos-kern.h      types.h
answers.txt  lib.c        mpos-app.h    mpos-symbols.ld  x86.h
bootstart.S  lib.h        mpos-app2.c   mpos-int.S       x86struct.h
conf         mergedep.pl  mpos-boot.c   mpos-x86.c
%

Now that you've unpacked the source, it's time to give the OSes a whirl.

Change into the weensyos1 directory and run the make program.

The WeensyOS Makefile (well, GNUmakefile) builds a hard disk image called mpos.img, which contains the MiniprocOS "kernel" and two applications, mpos-app.c and mpos-app2.c.

Type make to build the kernel image. Type make V= if you want to see the compile flags verbatim. Make's output should look something like this:

% make
+ hostcc mkbootdisk.c
+ as bootstart.S
+ cc mpos-boot.c
+ ld mpos-bootsector
+ cc mpos-kern.c
+ cc mpos-x86.c
+ as mpos-int.S
+ cc lib.c
+ ld mpos-kern
+ cc mpos-app.c
+ cc console.c
+ ld mpos-app
+ mk mpos.img
+ cc mpos-app2.c
+ ld mpos-app2
+ mk mpos2.img
%

Now that you've built the OS disk image, it's time to run it! We've made it very easy to boot a given disk image; just run this command:

% make run-mpos

This will start up Bochs, but not yet the emulated computer. (This is because Bochs is giving you a chance to set breakpoints on the emulated machine.) To start the emulated computer, type "c":

<bochs:1> c

After a moment you should see a window like this!

Hit "1" to try to run the first application, and you should see a window like this:

To quit Bochs, click the "Power" button in the upper-right corner. (Very funny, Bochs.)

QEMU Note: If you're running QEMU instead of Bochs, run the MiniprocOS with qemu -hda mpos.img. (The -hda option stands for Hard Disk A.) QEMU doesn't have a funky power button; just hit Control-C in the terminal to quit. QEMU will sometimes "grab" the keyboard, which prevents you from doing anything else. If you appear to have lost control of your computer, check QEMU's title bar: it may say something like "Press Ctrl-Alt to exit grab". Press Ctrl-Alt and things should return to normal.

MiniprocOS Application

You're now ready to start learning about the OS code!

Start first with the application, mpos-app.c. This application simply starts a single child process and waits for it to exit. It uses system calls that implement the process functions we discussed in class: fork starts a new process; exit exits a process; and wait returns a process's exit status.

Read and understand the code in mpos-app.c.

You might wonder how those system calls are implemented! As discussed in class, to call a system call, the application program executes a software interrupt instruction, which transfers control to the kernel. The system call's arguments are often stored in machine registers, and that's how MiniprocOS does it. Likewise, the system call's results are often returned in a machine register. On Intel 80386-compatible machines (colloquially called "x86s"), the interrupt instruction is called int, and registers have names like %eax, %ebx, and so forth. A special C language statement, called asm, can execute the interrupt instruction and connect register values with C-language variables.

Read and understand the comments in mpos-app.h. This file defines MiniprocOS's system calls. Also glance through the code, to see how system calls actually work!

The MiniprocOS kernel handles these system calls.

This kernel is different from conventional operating system kernels in several ways, mostly to keep the kernel as small as possible. For one thing, the kernel shares an address space with user applications, so that user applications could write over the kernel if they wanted to. This isn't very robust, since the kernel is not isolated from user faults, but for now it is easier to keep everything in the same address space. Another difference is that MiniprocOS implements cooperative multitasking, rather than preemptive multitasking. That is, processes give up control voluntarily, and if a process went into an infinite loop, the machine would entirely stop. In preemptive multitasking, the kernel can preempt an uncooperative process, which forces it to give up control. Preemptive multitasking is more robust than cooperative multitasking, meaning it's more resilient to errors, but it is slightly more complex. All modern PC-class operating systems use preemptive multitasking for user-level applications, but the kernel itself usually switches between internal tasks using cooperative multitasking.

MiniprocOS's main kernel structures are as follows.

struct procdescriptor_t: This is the process descriptor structure, which stores all the relevant information for each process. It is defined in mpos-kern.h.
procdescriptor_t miniproc[];: This is an array of process descriptor structures, one for each possible process. MiniprocOS supports up to 15 concurrent processes, with process IDs 1 to 15. The process descriptor for process I is stored in miniproc[I]. Initially, only one of these processes is active, namely miniproc[1]. The miniproc[0] entry is never used.
procdescriptor_t *current;: This points to the process descriptor for the currently running process.

The code in mpos-kern.c sets up these structures. In particular, the start() function initializes all the process descriptors.

Read and understand the code and comments in mpos-kern.h. Then read and understand the memory map in mpos-kern.c, the picture at the top that explains how MiniprocOS's memory is laid out. Then look at start().

The code you'll be changing in MiniprocOS is the function that responds to system calls. MiniprocOS has set up the hardware to jump to function interrupt() upon software interrupt.

Read and understand the code for interrupt() in mpos-kern.c. Concentrate on the simplest system call, namely sys_getpid/INT_SYS_GETPID. Understand how the sys_getpid application function (in mpos-app.h) and the INT_SYS_GETPID clause in interrupt() (in mpos-kern.c) interact.

Exercise 1. Answer the following question: What would happen if you replaced the run(current); in the INT_SYS_GETPID clause with schedule();? Would the sys_getpid() system call still appear to return the correct value to the process that called it? Why?

You may have noticed, though, that the sys_fork() system call isn't working! Your job is to write the code that actually creates a new process.

Exercise 2. Fill out the do_fork() and copy_stack() functions in mpos-kern.c.

Congratulations, you've written code to create a process -- it's not that hard, no? (Our version is less than 20 lines of code.) Here's what you should see when you're done:

Now take a look at the code in mpos-app.c that calls sys_wait(). Also look at the INT_SYS_WAIT implementation in mpos-kern.c. The current system call design uses a polling approach: to wait for process 2 to exit, a process must call sys_wait(2) over and over again until process 2 exits and the sys_wait(2) system call returns a value different from WAIT_TRYAGAIN.

We'll see more about polling later in the quarter, but for now, notice that polling approaches like this often reduce utilization. A process uses CPU time to call sys_wait(2) over and over again, leaving less CPU time for others. An alternative approach, which can improve utilization, is called blocking. A blocking implementation would put sys_wait(2)'s caller to sleep, then wake it up once process 2 had exited and a real exit status was available. The sleeping process doesn't use any CPU. A process that is asleep because the kernel is waiting for some event is called blocked.

Exercise 3. Change the implementation of INT_SYS_WAIT in mpos-kern.c to use blocking instead of polling. In particular, when the caller tries to wait on a process that has not yet exited, that process should block until the process actually exits.

To implement Exercise 3, you will probably want to add a field to the process descriptor structure. This field will indicate whether or not a process is waiting on another process. You will change INT_SYS_WAIT to add the calling process to this "wait queue", and INT_SYS_EXIT to wake any processes that were on the "wait queue". There are several ways to do this; describe how you did it in answers.txt.

To check your work, try changing the sys_wait() loop in mpos-app.c to look like this:

		do {
			status = sys_wait(p);
			console_printf("W");
		} while (status == WAIT_TRYAGAIN);

A polling implementation of sys_wait would produce output like this:

You want it to produce output like this:

Cleaning Up Processes

Now try running the other MiniprocOS application. You should see something like this (different processes generally print their lines in different colors):

The MiniprocOS2 application, in mpos-app2.c, tries to run 1024 child processes.

Read and understand mpos-app2.c.

Unfortunately, your current kernel code doesn't seem able to run more than 15 total processes, ever! It looks like old, dead processes aren't being cleaned up, even after we call sys_wait() on them. This is what we call a bug.

Exercise 4. Find and fix this bug.

When you've completed this exercise, the application should count up to 1024, like this:

Your colors may differ depending on how you implement sys_wait(). One common implementation strategy ends with several red lines in a row. If you see this in your code, try to figure out why!

This completes the minilab. But here are some extra credit opportunities, if you're interested.

Extra-Credit Exercise 5. Our version of sys_fork(), with its dirt simple stack copying strategy, works only for simple programs. For example, consider the following code:

        int x = 0;  /* YOUR CODE HERE (maybe) */
        p = sys_fork();
        if (p == 0)
                /* YOUR CODE HERE */;
        else if (p > 0)
                sys_wait(p); // assume blocking implementation
        console_printf("%d", x);
        sys_exit(0);

In a system with true process isolation, the child process's x and the parent process's x would be different variables, and changes in one process would not affect the other's x. But in MiniprocOS, this is not always the case! For this exercise, produce a version of that code with the following properties:

In a system with correct process isolation, the code would print "10".
In MiniprocOS, the code would print "11".

Extra-Credit Exercise 6. MiniprocOS miniprocesses have some aspects of threads. For instance, like threads, they all share a single address space. A big difference from threads is that we create a new process by forking. New threads are created in a different way. Introduce a new system call,

pid_t sys_newthread(void (*start_function)(void));

that creates a new process in a thread-like way. The new process should start with an empty stack, not a copy of the current stack. Rather than starting at the same instruction as the parent, the new thread should start by executing the start_function function: that is, that function's address becomes the new thread's instruction pointer.

Extra-Credit Exercise 7. Introduce a sys_kill(pid) system call by which one process can make another process exit. Use this system call to alter mpos-app2.c's run_child() function so that the even-numbered processes kill off all odd-numbered processes (except for thread 1). Running the application should print out "Process N lives" messages only for even-numbered values of N.

Handing in

You will electronically hand in code and a small writeup containing answers to the numbered exercises and extra-credit exercises.

Answer the numbered exercises by editing the file in the weensyos1/ directory named answers.txt. Only ASCII text files are accepted. Please do not hand in Microsoft word documents nor PDF files. For coding exercises, it's OK for answers.txt to just refer to your code (as long as you comment your code).

Run the command make tarball to create a file named weensyos1-yourusername.tar.gz. This tarfile contains the answers.txt file.

Submit the weensyos1-yourusername.tar.gz file here.