Introduction
This lab is optional. If you choose not to do this lab, then your final exam counts as 42% (i.e. 35%+7%) of your total grade. If you complete this lab, then your final exam counts as 35% of your total grade.
In this lab, you are going to implement a hash table that allows multiple threads to concurrently perform insert and lookup operations on it.
This is an individual lab. Please work on this lab in your individual private git repo.
Obtaining the lab
$ cd cso-labs/ # your individual git repo
$ git fetch upstream
$ git merge upstream/master
This creates a number of new files in subdirectory threadlab/.
The only files you should modify are htable.{h,c} and rwlock.{h,c}
Step 1: Make a fixed-size hash table thread-safe
Let's first solve the easy problem of adding locks to a hash table that does not do resize.
Read the given hash table implementation in files htable.{h,c}. We can see that the hash table contains an array of pointers. Each slot of the array contains
the head of a singly-linked list that chains together those inserted items that have hashed to the same slot. Function htable_init initializes the hash table with a certain number of slots. It also remembers whether the initialized hash table is allowed to resize itself or not. For step 1, the hash table does not resize.
To insert a tuple whose key is a C string and whose value is a (void *) pointer, function htable_insert computes a hashcode based on the key and finds the tuple's corresponding slot based on the hashcode. It then inserts the tuple into the linked list at that slot.
Add locks to the hash table code (you should only change htable.c and htable.h) so that multiple threads can insert and lookup items in the hash table concurrently.
The easiest locking strategy is to use a single big lock. However, this strategy does not leave any room for concurrency (resulting in bad performance with multiple threads). You are expected to use a fine-grained locking strategy.
Hint: you only need to use these pthread functions: pthread_mutex_init, pthread_mutex_destroy, pthread_mutex_lock, pthread_mutex_unlock.
To test your code, run ./tester -t htable. This test uses a fixed-size hash table.
Below is an example of a successful test.
$ make
$ ./tester -t htable
Initialized hash table of size 10007
Spawned 4 threads, each to insert 500000 tuples
All 4 threads finished. Throughput is 1.578250 inserts/sec
Spawned 4 threads, each to insert or lookup 500000 tuples
All 4 threads finished. Throughput is 1.195037 lookups/sec
--- HTABLE TEST PASSED (final htable size 10007)
You can change the number of threads that perform concurrent hash table operations by using
the -n option. For example, ./tester -t htable -n 1 uses a single thread
to perform hash table operations (i.e. there are no concurrent operations in this test). As another example, ./tester -t htable -n 2 performs hash table operations using two threads.
Step 2: Implement a reader-writer lock
In order to make hash table resize thread-safe, we are going to need an utility called the
reader-writer lock. Read the skeleton code of the reader-writer lock in rwlock.{c,h}.
A reader-writer lock can be acquired in either the "read" (also referred to as "shared") or "write" (also referred to as "excluse") mode. In the given code skeleton, function rwl_rlock attempts to grab the lock in "read" mode, rwl_runlock releases the grabbed lock in "read" mode. Similarly, function rwl_wlock and rwl_wunlock locks and unlocks in "write" mode.
The reader-writer lock allows multiple readers but only one writer at a time. In other words, If the lock has been acquired in the "read" mode, then other
threads can also grab the lock in the "read" mode without blocking. If the lock has
been acquired in the "write" mode, then no other thread should be able to acquire the lock in
either "read" or "write" mode.
Hint: you need to use conditional variable(s) to implement the reader-writer locks. The pthread functions related to conditional variables that you may need to use are: pthread_cond_init, pthread_cond_wait, pthread_cond_timedwait, pthread_cond_signal, pthread_cond_broadcast.
In order to faciliate testing, function rwl_rlock and rwl_wlock take a second
argument "expire" and requires any conditional wait inside the function to last no later than absolute time "expire". Therefore, we ask you to use our wrapper function cond_timedwait instead of pthread_cond_wait . When cond_wait returns ETIMEDOUT if timeout happens. It returns 0 if it has been woken up by a pthread_cond_signal or pthread_cond_broadcast. In both cases, the corresponding mutex (the second argument of cond_wait) is locked.
The reader-writer lock can have different scheduling policies. For this lab, we ask you to
prioritize the writer. In other words, as long as some writer is waiting to acquire the lock in "write" mode, no subsequent readers are allowed to acquire the lock in "read" mode.
To facilitate testing, we also ask you to implement rwl_nwaiter to return the number of threads who are currently waiting to acquire the lock in either mode.
To test your implementation, run ./tester -t rwl. Below is an example of a successful test:
$ ./tester -t rwl
Reader grabbed the lock
Another reader grabbed the lock
One reader unlocked
Another reader unlocked
Writer grabbed the lock
Writer unlocked
--- RWLOCK TEST PASSED basic correctness
Reader has locked but not yet unlocked
Spawned a writer who is waiting to lock
Spawned 8 readers who are waiting to lock
Writer locked before readers
--- RWLOCK TEST TEST PASSED (writer is given priority)
Step 3: Use reader-writer lock to make a resizable hash table thread-safe
Now that you have the reader-write lock, we can proceed to handle the scenario of hash table resize.
In particular, we can think of the hash table resize operation as an
"exclusive" activity: if a hash table resize is ongoing, no other threads can perform any other other hash table operations, such as lookups, inserts, and resize. On the other hands,
hash table lookups and inserts can be viewed as "shared" activities: multiple
hash table lookups and inserts can occur at the same time (provided that you have already locked properly in step 1) in the absence of resize.
Use your reader-writer lock in htable.{cc,h} to deal with resize. When you are done,
test using ./tester -t resize. An example of a successful run is as follows:
Initialized hash table of size 10007
Spawned 4 threads, each to insert 500000 tuples
All 4 threads finished. Throughput is 2.007357 inserts/sec
Spawned 4 threads, each to insert or lookup 500000 tuples
All 4 threads finished. Throughput is 1.923626 lookups/sec
--- RESIZE TEST PASSED (final htable size 1282529)
Run all tests
To run all tests together, type ./test -t all.
Make sure you run the tests many times as concurrency bugs are non-deterministic and may show up
rarely.