Distributed Systems - Fall 2019

Lab 1: Primary-backup replication

Due: 9/22 at 11:59pm


Introduction

In this lab, you will implement a primary-backup replication protocol based on Viewstamped Replication (VR). Specifically, your primary-backup service will replicate a log (i.e. a sequence of arbitrary entries) across a collection of servers and will be able to survive node and network failures, as long as a majority of nodes are functioning and can communicate with each other.

Before starting this lab, familiarize yourself with Go by completing the online Go tutorial. This lab may be your first exposure to writing challenging concurrent code, and a first attempt can easily become too tangled to reason about. Give yourself enough time to construct a clean and readable implementation so that you can easily reason about its correctness. Although subsequent labs do not build on your implementation of this lab, your experience in writing clean concurrent code will greatly help in later labs.

In this lab, we expect your servers to exchange RPCs using the labrpc Go package that we provide to you. It is modeled after Go's rpc library, but internally uses Go channels rather than sockets. The reason you must use labrpc instead of Go's RPC package is that the tester tells labrpc to delay RPCs, re-order them, and delete them to simulate challenging network conditions under which your code should work correctly. Don't modify labrpc, because we will test your code with labrpc as handed out.

Getting Started

Follow the lab setup instructions here.

Do a git pull to get the latest lab software. The code related to this lab is in src/simplepb (which contains skeleton code for your primary-backup replicated log service) and in our RPC library in src/labrpc.

To get up and running, execute the following commands:

$ cd ~/golabs
$ git pull
...
$ cd src/simplepb
$ export GOPATH=$HOME/golabs
$ go test -v
=== RUN   Test1ABasicPB
--- FAIL: Test1ABasicPB (0.00s)
	config.go:242: node-0 rejected command
...
$
When you've finished both parts of the lab, your implementation should pass all the tests in the src/simplepb directory:
$ go test -v
=== RUN   Test1ABasicPB
 ... Passed
--- PASS: Test1ABasicPB (0.80s)
=== RUN   Test1AConcurrentPB
 ... Passed
--- PASS: Test1AConcurrentPB (12.10s)
=== RUN   Test1AFailButCommitPB
 ... Passed
--- PASS: Test1AFailButCommitPB (0.18s)
=== RUN   Test1AFailNoCommitPB
 ... Passed
--- PASS: Test1AFailNoCommitPB (9.79s)
=== RUN   Test1BSimpleViewChange
--- PASS: Test1BSimpleViewChange (4.02s)
=== RUN   Test1BConcurrentViewChange
--- PASS: Test1BConcurrentViewChange (6.02s)
PASS
ok  	simplepb	33.326s

The code at a glance

Implement primary-backup replication by adding code to simplepb/server.go. In that file you'll find a bit of skeleton code, plus examples of how to send and receive RPCs.

Your implementation must support the following interface, which the tester will use. You'll find more details in comments in server.go.

// create a new primary-backup server instance:
srv := Make(peers, me, startingView)

// try to add an entry to the log
srv.Start(command interface{}) (index, view, ok)

// find out whether the position "index" in the log has been committed
srv.IsCommitted(index) (committed)

// prompt the server to start changing to newView
srv.PromptViewChange(newView)

// find out the current view of this server and whether its status is NORMAL
srv.ViewStatus() (currentView, statusIsNormal)

// returns the log entry at position index
srv.GetEntryAtIndex(index) (command)

A service calls Make(peers,me,startingView) to create a new server. The peers argument is an array of established RPC connections, one to each server replica (including this one). The me argument is the index of this server in the peers array. startingView is the view number of the initial view that all servers start in; it is typically set to zero. Start(command) asks a server to start adding the command to the replicated log. Start() should return immediately, without waiting for this process to complete. Our tester calls IsCommitted(..) to check that some log entry has been "committed" at the expected index.
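To keep the pieces straight, here is one possible shape for the per-replica state and the constructor. This is only a sketch: the field names, status constants, and the authoritative Make signature live in the skeleton's server.go, and labrpc endpoints are reduced to a server count here.

```go
package main

import "sync"

// Server status values (illustrative names; use whatever
// constants the skeleton defines).
const (
	NORMAL = iota
	VIEWCHANGE
	RECOVERING
)

// PBServer sketches the state a VR-style primary-backup
// replica needs to track.
type PBServer struct {
	mu          sync.Mutex    // protects all fields below
	me          int           // this server's index in peers[]
	nservers    int           // total number of replicas (len(peers))
	currentView int           // view this server believes it is in
	status      int           // NORMAL, VIEWCHANGE, or RECOVERING
	log         []interface{} // the replicated log of commands
	commitIndex int           // highest log index known to be committed
}

// Make mirrors the constructor the tester calls; peers is
// reduced to a count since labrpc connections are elided.
func Make(nservers, me, startingView int) *PBServer {
	return &PBServer{
		me:          me,
		nservers:    nservers,
		currentView: startingView,
		status:      NORMAL,
	}
}

func main() {}
```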

Part 1A: Primary-backup in the Normal case

Your job for part-1A is to implement the replication of a log entry in the normal case when the primary does not fail. When you are finished, test your code for part-1A using

$ go test -v -run 1A

When Start(command) is invoked, the primary should append the command to its log and then send Prepare RPCs to the other servers to instruct them to replicate the command at the same index in their logs. Note that a server should do the processing for Start only if it believes itself to be the current primary and its status is NORMAL (as opposed to VIEW-CHANGE or RECOVERY).
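A minimal sketch of the Start path under these rules, assuming a PBServer struct with a mutex, a log slice, and view/status fields (names are illustrative; the log-index convention and the Prepare fan-out are left to the skeleton):

```go
package main

import "sync"

const NORMAL = 0 // illustrative status constant

type PBServer struct {
	mu          sync.Mutex
	me          int // index of this server in peers[]
	nservers    int // len(peers) in the real code
	currentView int
	status      int
	log         []interface{}
}

// Start appends the command locally and returns immediately;
// replication to the backups happens asynchronously.
func (srv *PBServer) Start(command interface{}) (index int, view int, ok bool) {
	srv.mu.Lock()
	defer srv.mu.Unlock()
	// Only the primary of the current view, in NORMAL status,
	// may accept new commands.
	if srv.currentView%srv.nservers != srv.me || srv.status != NORMAL {
		return -1, srv.currentView, false
	}
	srv.log = append(srv.log, command)
	index = len(srv.log) - 1 // index convention follows the skeleton
	// Real code: launch goroutines here that send Prepare RPCs
	// carrying (view, index, command, commitIndex) to all peers.
	return index, srv.currentView, true
}

func main() {
	srv := &PBServer{me: 0, nservers: 3, status: NORMAL}
	srv.Start("put x=1") // accepted: server 0 is view 0's primary
}
```

Holding the lock only for the local append, and doing the RPC fan-out in separate goroutines, is what lets Start return without waiting for replication to complete.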

Upon receiving a Prepare RPC message, the backup checks whether the message's view matches its currentView and whether the next entry to be added to its log is indeed at the index specified in the message. If so, the backup adds the message's entry to its log and replies Success=true. Otherwise, the backup replies Success=false. Furthermore, if the backup's state has fallen behind the primary's (e.g. its view is smaller or its log is missing entries), it performs recovery to transfer the primary's log. Note that the backup server needs to process Prepare messages in index order; otherwise, it would end up unnecessarily rejecting many messages.
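The backup-side check above can be sketched as follows. The argument and reply types here are hypothetical stand-ins for whatever server.go defines, and locking is omitted for brevity:

```go
package main

// Hypothetical RPC argument/reply types; the skeleton's actual
// definitions in server.go may differ.
type PrepareArgs struct {
	View        int         // primary's current view
	Index       int         // log index at which to place Entry
	Entry       interface{} // the command to replicate
	CommitIndex int         // primary's commit point, piggybacked
}

type PrepareReply struct {
	View    int
	Success bool
}

type PBServer struct {
	currentView int
	log         []interface{}
	commitIndex int
}

// Prepare is the backup-side handler: accept the entry only if
// the views match and it lands exactly at the next free log slot.
func (srv *PBServer) Prepare(args *PrepareArgs, reply *PrepareReply) {
	reply.View = srv.currentView
	if args.View != srv.currentView || args.Index != len(srv.log) {
		reply.Success = false
		// If we have fallen behind (smaller view or missing
		// entries), kick off Recovery here (elided).
		return
	}
	srv.log = append(srv.log, args.Entry)
	// Adopt the primary's piggybacked commit point.
	if args.CommitIndex > srv.commitIndex {
		srv.commitIndex = args.CommitIndex
	}
	reply.Success = true
}

func main() {}
```

One way to enforce index-order processing is to buffer Prepares that arrive early and apply them once the gap is filled, rather than rejecting them outright.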

If the primary has received Success=true responses from a majority of servers (including itself), it considers the corresponding log index as "committed". (Since servers process Prepare messages in index order, committing an index implies that all smaller indices are committed as well.) It advances the committedIndex field locally and also piggybacks this information to backup servers in subsequent Prepares.
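The quorum arithmetic is small but easy to get wrong; a sketch (function names are illustrative):

```go
package main

// majority returns the number of positive responses needed for
// a quorum among nservers.
func majority(nservers int) int {
	return nservers/2 + 1
}

// committedByQuorum reports whether an entry is committed given
// the number of Success=true Prepare responses so far; the
// primary counts its own local append as one success.
func committedByQuorum(successes, nservers int) bool {
	return successes >= majority(nservers)
}

func main() {}
```

With 3 servers a quorum is 2, so the primary plus one backup suffices; with 5 servers it is 3.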

When you've finished implementing what has been described above, you should be able to pass the first two of three tests for part-1A; namely, Test1ABasicPB and Test1AConcurrentPB. Run these two tests using:

$ go test -v -run 1ABasic
$ go test -v -run 1AConcurrent
If a backup node has crashed, the system can continue undisturbed since the primary only needs to receive successful Prepare replies from a majority of servers in order to consider a log entry committed. When the crashed backup comes back online, it will find that its log is out of sync with the primary's and will be unable to respond positively to the primary's Prepares. In this case, the backup needs to synchronize with the primary by sending a Recovery RPC. The primary replies to the Recovery by sending its log in its entirety. Once the backup has caught up with the primary, it can respond positively to the primary's Prepare messages. (Note that our recovery mechanism here is much simpler than the one described in the "Viewstamped Replication Revisited" paper (section 4.3). This is because we assume nodes log operations to persistent storage before replying to the primary's Prepare requests. Thus, it is never the case that a recovered node "forgets" operations that it has already prepared and replied PrepareOK to. By contrast, the VR paper's recovery protocol handles the case of "forgetful" recovered nodes.)
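Because the primary simply ships its whole log, both sides of Recovery are short. A sketch with hypothetical RPC types (the real definitions are in server.go):

```go
package main

// Hypothetical Recovery RPC types; the skeleton may define
// these differently.
type RecoveryArgs struct {
	View   int // current view of the recovering backup
	Server int // id of the recovering backup
}

type RecoveryReply struct {
	View        int
	Entries     []interface{} // primary's log, in its entirety
	CommitIndex int
	Success     bool
}

type PBServer struct {
	currentView int
	log         []interface{}
	commitIndex int
}

// Recovery (primary side): ship the whole log to the backup.
func (srv *PBServer) Recovery(args *RecoveryArgs, reply *RecoveryReply) {
	reply.View = srv.currentView
	reply.Entries = srv.log
	reply.CommitIndex = srv.commitIndex
	reply.Success = true
}

// installRecoveredLog (backup side): overwrite local state with
// the primary's reply so subsequent Prepares can succeed.
func (srv *PBServer) installRecoveredLog(reply *RecoveryReply) {
	srv.currentView = reply.View
	srv.log = reply.Entries
	srv.commitIndex = reply.CommitIndex
}

func main() {}
```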

When you've finished implementing the Recovery aspect, you should be able to pass all three tests of part-1A by typing go test -v -run 1A

Part 1B: Primary-backup with View-change

Your job for part-1B is to implement the view change protocol of VR, so that if an existing primary has crashed, the system can switch to using a new primary.

To guarantee correctness (i.e. consistency), the protocol must ensure that 1) no two primaries can both commit entries simultaneously, and 2) all log entries that have been considered committed by the primary at view v-1 are present in the log of the primary at view v. The VR protocol guarantees both using the idea of quorum intersection.

Our lab simplifies the view change protocol described in "Viewstamped Replication Revisited". As a further simplification, we also ignore the practical issue of discovering when nodes should perform a view-change through periodic health checks. Instead, a server initiates a view change only when explicitly told to by the tester.

Below, we describe how the view-change process should work in this lab.

The tester prompts a view-change by invoking the PromptViewChange(newView) function on the primary server for newView. Recall that in VR, the view number uniquely determines the primary server. In our implementation, all servers are identified by their index in the peers array, and we map each view number to the id of its primary as view-number % total_servers (the auxiliary function GetPrimary(view, nservers) performs this calculation).
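The mapping described above is one line of Go:

```go
package main

// GetPrimary maps a view number to the id of that view's
// primary, as described in the handout.
func GetPrimary(view, nservers int) int {
	return view % nservers
}

func main() {}
```

For example, with 3 servers, views 0, 1, 2 are led by servers 0, 1, 2, and view 3 wraps around to server 0 again.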

The primary server starts the view-change process by sending a ViewChange RPC to every replica server (including itself). Upon receiving ViewChange, a replica server checks that the view number included in the message is indeed larger than what it believes the current view number to be. If the check succeeds, it sets its current view number to that in the message and changes its status to VIEW-CHANGE. It replies Success=true and includes its current log (in its entirety) as well as the latest view number in which its status was NORMAL. If the check fails, the replica replies Success=false.

If the primary has received successful ViewChange replies from a majority of servers (including itself), it can proceed to start the new view. It needs to start the new view with a log that contains all the committed entries of the previous view. To maintain this invariant, the primary chooses the log among the majority of successful replies using this rule: it picks the log whose latest normal view number is the largest. If there is more than one such log, it picks the longest among them. Once the primary has determined the log for the new view, it sends a StartView RPC to all servers to instruct them to start the new view. Upon receiving StartView, a server adopts the new view as indicated in the message and changes its status to NORMAL. Note that before adopting the new view from the StartView RPC message, the server must again check that its current view is no bigger than that in the RPC message, which ensures there has been no concurrent view-change to a larger view.
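The log-selection rule can be sketched as a pure function over the successful replies. The reply type here is a hypothetical stand-in for the one defined in server.go:

```go
package main

// Hypothetical shape of a successful ViewChange reply; the real
// type lives in server.go and may differ.
type ViewChangeReply struct {
	LastNormalView int           // latest view this replica was NORMAL in
	Log            []interface{} // replica's entire log
}

// determineNewViewLog applies the selection rule: among the
// majority of successful replies, take the log with the largest
// last-normal view number, breaking ties by log length.
func determineNewViewLog(replies []*ViewChangeReply) []interface{} {
	var best []interface{}
	bestView := -1
	for _, r := range replies {
		if r.LastNormalView > bestView ||
			(r.LastNormalView == bestView && len(r.Log) > len(best)) {
			bestView = r.LastNormalView
			best = r.Log
		}
	}
	return best
}

func main() {}
```

Intuitively, any committed entry was accepted by a majority in some normal view, and that majority intersects the view-change quorum, so the chosen log cannot miss a committed entry.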

We have implemented the logic for sending out ViewChange and StartView RPCs for you, in the PromptViewChange function. To complete the view change process, you need to finish the implementation of three functions: the two RPC handlers, ViewChange(...) and StartView(...), and the function determineNewViewLog, which is invoked after the server collects a set of ViewChange replies to decide on the log to be used for the new view.

Once you have finished the implementation, you can test by doing:

$ go test -v -run 1B

Hand in procedure

Once you have finished both part-1A and part-1B, run all the tests many times (bugs in concurrent programs often surface only on particular runs) to ensure that your program passes the tests reliably.

Hand in your code by pushing to the github repository before the deadline.

$ git commit -am "Finish lab-1"
$ git push