These are notes taken during CMSC 818e: Distributed And Cloud-Based Storage Systems. Course webpage and syllabus here.

Day One

Readings

There are readings for each class; do the readings before class and submit notes via blog by 6PM the night before the reading is due.

Projects

Project 1: A Key-Value Store in Go - implement the second half of a remote procedure call (Due midnight Sunday Sept 9th)
Project 2: Serialization, Persistence, and Immutability
Project 3: Replication, and a Cloud Client
Project 4: Distributed Consensus
Project 5: Building High-Level Abstractions from a Shared Log

Access projects via CMS Gitlab; submit via Dr. Keleher’s server.

Use Piazza to talk about projects; for Go troubleshooting.

Distributed Systems

Decentralized:

no single master (usually), point-of-failure
local control/autonomy
some have done unusual things like using entropy to decide

(often)P2P

all are peers
peers are basically the same
file-sharing:
- sometimes means data distribution, control central

Distribution and Geo-replication:

distance (latency, no simultaneity, agreement difficult)
“a DS is one where networked computers coordinate activity only by passing messages”
- Internet, intranets, mobile devices
small
- low latency
- high bandwidth
- secure, reliable interconnect
- dependent failure
- coordinated resources
big
- high latency
- low bandwidth
- autonomous
- unreliable network
- fear and distrust (byzantine failure)
- independent failures
- decentralized administration

Challenges

private communication over private networks
- who sent it (auth), integrity, privacy
building reliable systems from unreliable components
- independent failures

“A distributed system is one in which the failure of a machine I have never heard of can prevent me from doing my work. - Leslie Lamport

location
- placement for efficient sharing
- finding it later
coordination and shared state
- what should we do, and when?
- can we agree on what we’ve done?
- who is this “we” you speak of?

Reliability

recoverability:
- don’t lose data if a failure occurs (also durability)
- assume
  - nodes have storage
  - volatile: fast (memory)
  - non-volatile: slow (disks)
- but NVRAM getting cheaper, flash memory, etc
availability
survivability
also
- security, adaptability, agility, etc

Consequences of distribution

concurrent processes
- user work independently
- non-determinism, race conditions, sync, deadlock, liveliness
no global clock
- coordination through messages (CSP)
- clock sync works, but only to a degree
no global state
- no single process has knowledge of entire state of the system
failures
- node failures
- network partitions

Why Go

interfaces - no inheritance, no objects in Go, just structs
slices > array
- slices give you a pointer
- copy by value => copy by reference
- append
type safe
gothreads - lightweight
channels - the way that gothreads talk to each other. FIFO buffer, synchronize multiple writers and readers. can be single value, can be buffered
garbage collection - pervasive in Go. Can be a problem - will happen asynchronously with respect to your program, don’t know when it will happen

Preliminaries

GOPATH
go build/install/get/run …
emacs? go-mode comes w/ distribution
visual studio?
atom w/ go pro?

Gotchas

everything is passed by values: pointers, but also structs, arrays (use slices)

 var arr = [3]int{1,2,3}

 sl := arr[0:2] // first two items but not last
 fmt.Println(sl)

 sl2 := []int{4,5,6}
 fmt.Println(sl2)

 sl3 := append(sl2, 7, 8)
 fmt.Println(sl3)

 sl4 := append(sl, 7, 8)
 fmt.Println(sl4)

no pointer arithmetic
there is a ++ operator, but it’s not an expression
no type aliases
- ` type dint int` no distinct types
- can add methods to user-defined types
duck typing - hard to look at a struct and see what type it is
no ternary
no constructors
no destructors
no default args
everything is a package

The CAP Theorem

The CAP theorem states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:

Consistency: Every read receives the most recent write or an error
Availability: Every request receives a (non-error) response – without guarantee that it contains the most recent write
Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes

Exercise: Loops and Functions

Practice with control flow: Fill out the definition of the square root iteratively using the Tour of Go implement a for loop:

// Modify the code below to compute a square root w/ the iterative approach
// outlined. You may use "math.Abs()", but not any other math method.
// Modify sqrt to return both a number of iterations, and the final value,
// modify main to print them out.
package main

import (
	"fmt"
	"math"
)

func Sqrt(x float64) (float64, int) {
	var z, diff, iter = 1.0, 1.0, 0
	for i := 1; i < 20; i++ {
		diff = (z*z - x) / (2 * z)
		switch {
		case math.Abs(diff) < 0.01:
			break
		default:
			z -= diff
			iter++
		}

	}
	return z, iter

}

func main() {
	z, iter := Sqrt(50)
	fmt.Printf("After %v iterations, z = %v\n", iter, z)
}

CMSC 818e Day 1

August 27, 2018