In this sequence of labs, you'll build a multi-server file system called Yet-Another File System (yfs) in the spirit of Frangipani. At the end of all the labs, your file server architecture will look like this:
You'll write a file server process, labeled yfs above, using the FUSE toolkit. Each client host will run a copy of yfs. yfs will appear to local applications on the same machine by registering via FUSE to receive file system events from the operating system. yfs will store all the file system data on an extent server on the network, instead of on a local disk. yfs servers on multiple client hosts can share the file system by sharing a single extent server.
This architecture is appealing because (in principle) it shouldn't slow down very much as you add client hosts. Most of the complexity is in the per-client yfs program, so new clients make use of their own CPUs rather than competing with existing clients for the server's CPU. The extent server is shared, but hopefully it's simple and fast enough to handle a large number of clients. In contrast, a conventional NFS server is pretty complex (it has a complete file system implementation) so it's more likely to be a bottleneck when shared by many NFS clients.
The class machines are called {brain,eye,heart}@news.cs.nyu.edu.
If you want to do the labs on your own Linux machine, install FUSE by running apt-get install libfuse2 libfuse-dev fuse-utils (if it's an Ubuntu/Debian box). Or, you can compile and install fuse-2.6-3 from source.
ds-class-debian.zip (590 MB)
Non-root account: username=notroot, password=6.82fork
Root account: password=6.82fork
You should be able to run this with VMWare Player, which is available for free from VMWare. We have tested it on Linux; it should work for Windows as well. Note that the official environment for the labs is still the class machines, and that's where we will be testing your code. So you should always test your code on the class machine before handing it in! For example, it's likely that you need to implement both the CREATE and MKNOD operations as different operating systems send either CREATE or MKNOD in response to the creation of a file. Also note that we don't really have the energy or expertise to debug any VMWare problems you might have. This is just meant to be helpful for those of you who don't have personal Linux machines, and find it inconvenient to log in at a class machine.
There has been at least one report that running the code in VMWare is much slower than running it directly on hardware, and as a result, RPCs time out more often even when RPC_LOSSY is unset. Be on the lookout for errors related to this.
NOTE: You may find that the network doesn't work when you boot the image up for the first time. This is a known problem with the Thought Police image. To fix this, run the following command as root:
$ rm /etc/udev/rules.d/z25_persistent-net.rules && reboot
To install additional packages on this image, log in as root and run apt-get install NAME_OF_DEBIAN_PACKAGE.
Pthread interfaces are somewhat cumbersome to use. You can (optionally) take a look at a simple scoped lock implementation in rpc/slock.h. Check the RPC library code to understand how it is used. You are not supposed to use the Boost C++ library for this lab (Boost provides various wrapper classes for pthread primitives).
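To illustrate the scoped lock idea, here is a minimal sketch. It is not the contents of rpc/slock.h; the class name ScopedLock, the counter, and the function increment_counter are hypothetical names chosen for this example. The constructor acquires the mutex and the destructor releases it, so the lock is dropped on every path out of the enclosing scope.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical RAII guard (an illustration, not the class in rpc/slock.h).
// Locks the mutex on construction and unlocks it on destruction.
class ScopedLock {
 public:
  explicit ScopedLock(pthread_mutex_t *m) : m_(m) {
    int r = pthread_mutex_lock(m_);
    assert(r == 0);
    (void)r;  // silence unused-variable warnings when NDEBUG is set
  }
  ~ScopedLock() { pthread_mutex_unlock(m_); }

 private:
  // Non-copyable: copying a guard would unlock the mutex twice.
  ScopedLock(const ScopedLock &);
  ScopedLock &operator=(const ScopedLock &);
  pthread_mutex_t *m_;
};

// Hypothetical shared state protected by a mutex.
static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

// Increments the counter under the lock. The guard's destructor runs
// after the return value is computed, releasing the mutex automatically.
int increment_counter() {
  ScopedLock guard(&counter_mutex);
  return ++counter;
}
```

Compared with paired pthread_mutex_lock and pthread_mutex_unlock calls, this pattern makes it much harder to forget an unlock on an early return.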
printf statements are always your friend when debugging any kind of problem in your programs. However, when programming in C/C++, you should also be familiar with gdb, the GNU debugger. You may find this gdb reference useful. Below are a few gdb tips for complete newbies:
If your program is crashing (segmentation fault), type gdb program core to examine the core file, where program is the name of the binary executable. If you don't find the core file anywhere, type ulimit -c unlimited before starting your program again. Once inside gdb, type bt to examine the stack trace at the point where the segmentation fault happened.
While your program is running, you can attach gdb to it by typing gdb program 1234. Again, program is the name of the binary executable, and 1234 is the process ID of your running program. Of course, you can choose to run your program under gdb from the beginning. If so, simply type gdb program. Then at the gdb prompt, type run.
While in gdb, you can set breakpoints (use gdb command b) to stop the execution at specific points, examine variable contents (use gdb command p), etc.
To apply a given gdb command to all threads in your program, prepend thread apply all to your command. For example, thread apply all bt dumps the backtrace for all threads.
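As a complement to the ulimit -c unlimited tip above, a process can also raise its own core file size limit at startup using the standard getrlimit and setrlimit calls. The sketch below is an optional illustration, not part of the lab code; the function name enable_core_dumps is hypothetical. It raises the soft limit only as far as the hard limit, which an unprivileged process is allowed to do.

```cpp
#include <sys/resource.h>
#include <cassert>

// Hypothetical helper: raise the soft core-size limit to the hard limit
// so that a crash can produce a core file. Returns true on success.
// If the hard limit itself is 0, core files remain disabled.
bool enable_core_dumps() {
  struct rlimit lim;
  if (getrlimit(RLIMIT_CORE, &lim) != 0) {
    return false;
  }
  // The soft limit never exceeds the hard limit.
  assert(lim.rlim_cur <= lim.rlim_max);
  lim.rlim_cur = lim.rlim_max;
  return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling enable_core_dumps() early in main has roughly the effect of the shell command for that one process, assuming the hard limit permits core files.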
W. Richard Stevens' books "UNIX Network Programming", Volumes 1 and 2, are classic references for network programming. If you are struggling with the sockets interface, they could be a helpful purchase. See the suggested books list for other helpful references.