Distributed System Course Note

Overview

  • Challenges:
    1. Workloads are large
    2. Failures
    3. Semantics
  • DS Goals:
    1. Scalability
    2. Fault tolerance
      1. Availability
      2. Durability
    3. Meaningful semantics
    4. Latency
  • Approaches:
    1. Scalable, abstraction & infrastructure / Partitioning / Sharding
    2. Replication
    3. Coordination protocols
      1. Commitment protocols
      2. Consensus protocols
  • Basic architecture
  • Fix Latency
    • Cache
    • From a capacity standpoint, reads improve because the DB is not involved (if the working set fits in the cache).
    • Write
      1. Write-Through cache (sync): still bad for latency, since every write must reach the DB
      2. Write-Back cache (async): good, the DB write happens later (see the sketch after this list)
    • Scalability
      • Throughput improves, but writes don’t improve much
    • Consistency -> Good
    • Fault tolerance
      1. Availability: Bad
      2. Durability: Write-Through is no worse (every write reaches the DB); Write-Back is worse (unflushed writes can be lost)
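
A minimal sketch of the write-through vs. write-back distinction above, in Go. The `slowStore` and `cache` types are invented for illustration; a real cache would also handle eviction and concurrent access.

```go
// Hypothetical sketch contrasting write-through and write-back caching.
package main

import "fmt"

// slowStore stands in for the backing database.
type slowStore struct{ data map[string]string }

func (s *slowStore) put(k, v string) { s.data[k] = v } // imagine this call is slow

type cache struct {
	store *slowStore
	data  map[string]string
	dirty map[string]bool // keys not yet flushed (write-back only)
}

// Write-through: update cache and DB synchronously; latency includes the DB write.
func (c *cache) writeThrough(k, v string) {
	c.data[k] = v
	c.store.put(k, v)
}

// Write-back: update only the cache and mark dirty; the DB write happens later,
// so the client sees cache latency, at the cost of durability until flush.
func (c *cache) writeBack(k, v string) {
	c.data[k] = v
	c.dirty[k] = true
}

// flush pushes dirty entries to the store later (e.g. on a timer or on eviction).
func (c *cache) flush() {
	for k := range c.dirty {
		c.store.put(k, c.data[k])
		delete(c.dirty, k)
	}
}

func main() {
	s := &slowStore{data: map[string]string{}}
	c := &cache{store: s, data: map[string]string{}, dirty: map[string]bool{}}
	c.writeThrough("a", "1") // in the DB immediately
	c.writeBack("b", "2")    // only in the cache until flush()
	c.flush()
	fmt.Println(s.data)
}
```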

MapReduce

Map -> takes an input split and emits (Key, Value) pairs
Reduce -> for each key K, applies f to all of its values V
Reduce starts only after all map tasks have completed
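
A toy word-count example of this Map/Reduce contract. The `KeyValue` struct and function signatures are assumptions for illustration, not the actual lab interface; the grouping step in `main` shows why Reduce can only run after all Map output has been collected.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

type KeyValue struct {
	Key   string
	Value string
}

// Map: input -> list of (key, value) pairs.
func Map(contents string) []KeyValue {
	words := strings.FieldsFunc(contents, func(r rune) bool { return !unicode.IsLetter(r) })
	kvs := []KeyValue{}
	for _, w := range words {
		kvs = append(kvs, KeyValue{Key: w, Value: "1"})
	}
	return kvs
}

// Reduce: apply f to all values collected for one key (here: count them).
func Reduce(key string, values []string) string {
	return strconv.Itoa(len(values))
}

func main() {
	// Shuffle phase: group intermediate pairs by key before any Reduce runs.
	grouped := map[string][]string{}
	for _, kv := range Map("the quick brown fox jumps over the lazy dog") {
		grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
	}
	for k, vs := range grouped {
		fmt.Println(k, Reduce(k, vs))
	}
}
```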

HW1

Part 1

* strings.FieldsFunc(value, func)
* strconv.Atoi(string) 
    * `tmp, err := strconv.Atoi(string)`
* List
    * list.New()
    * list.PushBack(v) 
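
Quick demo of the three pieces above (`strings.FieldsFunc`, `strconv.Atoi`, `container/list`); the input string and the digit-splitting predicate are just placeholders.

```go
package main

import (
	"container/list"
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

func main() {
	// Split on anything that is not a digit.
	fields := strings.FieldsFunc("12,34;56", func(r rune) bool { return !unicode.IsDigit(r) })

	l := list.New()
	for _, f := range fields {
		n, err := strconv.Atoi(f) // string -> int, with an error to check
		if err != nil {
			continue
		}
		l.PushBack(n)
	}

	// Walk the list front to back.
	for e := l.Front(); e != nil; e = e.Next() {
		fmt.Println(e.Value)
	}
}
```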

Part 2&3

* Channel
* Goroutine
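
A minimal fan-out/fan-in pattern with goroutines and a channel, which is the shape these parts typically take; the task names and the `done` channel are illustrative, not the lab's actual code.

```go
package main

import "fmt"

func main() {
	tasks := []string{"task0", "task1", "task2"}
	done := make(chan string)

	// Fan out: one goroutine per task.
	for _, t := range tasks {
		go func(name string) {
			// ... do the work for this task ...
			done <- name // signal completion
		}(t)
	}

	// Fan in: wait for every goroutine to report back.
	for range tasks {
		fmt.Println("finished:", <-done)
	}
}
```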

Isolation

| Semantics        | Mechanism                                             | Properties |
| ---------------- | ----------------------------------------------------- | ---------- |
| Serializability  | 2PL                                                   |            |
| Repeatable read  |                                                       |            |
| Read Committed   | 2PL for reads and writes; locks held until end of TX  |            |
| Read Uncommitted |                                                       |            |
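
A rough sketch of (strict) two-phase locking, assuming one mutex per key: locks are acquired during the growing phase and released only at commit. Reads and deadlock avoidance are omitted, and all type and function names are invented for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

type store struct {
	locks map[string]*sync.Mutex
	data  map[string]int
}

type tx struct {
	s    *store
	held []string // keys locked by this transaction
}

// write acquires the key's lock before touching it (growing phase).
// Assumes each key is written at most once per transaction (no re-locking).
func (t *tx) write(k string, v int) {
	t.s.locks[k].Lock()
	t.held = append(t.held, k)
	t.s.data[k] = v
}

// commit releases every lock only at the end of the transaction (shrinking phase),
// which is what makes the schedule serializable.
func (t *tx) commit() {
	for _, k := range t.held {
		t.s.locks[k].Unlock()
	}
	t.held = nil
}

func main() {
	s := &store{
		locks: map[string]*sync.Mutex{"x": new(sync.Mutex), "y": new(sync.Mutex)},
		data:  map[string]int{},
	}
	t := &tx{s: s}
	t.write("x", 1)
	t.write("y", 2)
	t.commit()
	fmt.Println(s.data)
}
```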

Consistency (replication)

Linearizability (external consistency): for any execution of the storage system, there exists an equivalent serial order of R/W operations such that if an operation O1 starts after another operation O2 completes, then O2 precedes O1 in that serial order.

Sequential consistency: for any execution of the storage system, there exists an equivalent serial order of all R/W ops that matches each client's program order.
Causal consistency: any execution is equivalent to one that preserves the order of causally dependent ops.
Eventual consistency: if updates stop, replicas eventually converge to the same state.
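
A toy single-copy register that makes the linearizability definition concrete: every operation takes effect atomically while a mutex is held (its linearization point), so any operation that starts after another completes sees its effect. This is a local illustration only, not a replicated implementation.

```go
package main

import (
	"fmt"
	"sync"
)

type Register struct {
	mu sync.Mutex
	v  int
}

func (r *Register) Write(v int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.v = v // linearization point of the write
}

func (r *Register) Read() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.v // linearization point of the read
}

func main() {
	var r Register
	var wg sync.WaitGroup
	for i := 1; i <= 3; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			r.Write(v)
		}(i)
	}
	wg.Wait()
	// A read that starts after all writes have completed must return the value
	// of whichever write comes last in the equivalent serial order.
	fmt.Println(r.Read())
}
```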