CMSC 818e Day 12

Reading time ~2 minutes

These are notes taken during CMSC 818e: Distributed And Cloud-Based Storage Systems. Course webpage and syllabus here.

Day Twelve

Serializability

Conflict Serializability

  • Two read/write instruction “conflict” if
    • they are by different transactions
    • they operate on the same data item
    • at least one is a write instruction

why do we care?

  • if two read/write instructions don’t conflict they can be swapped without any change in the final effect
  • if they do conflict they an’t be swapped

Conflict-equivalent schedules

  • if S can be transformed into S’ through a series of swaps, S and S’ are called conflict-equivalent
  • conflict-equivalence guarantees the same final effect

Conflicting write instructions don’t matter because they’ll be overwritten anyway. The final write is the only one that matters.

View-serializability for S’ and S and each datum Q:

  • if Ti reads initial value of Q in S, must also in S’
  • if Ti reads value written from Tj in S, must also in S’
  • it Fi performs final write to Q in S, must also in S’

Testing for conflict-serializability

  • given a schedule, determine if it is conflict-serializable (this is slow N factorial)
  • draw a precedence-graph over the transactions (much more efficient!)
    • if there is a cycle in the graph, not conflict serializable
    • if there is none, it is conflict-serializable

How to guarantee serializability?

  • Allowing transactions to run, and then aborting them if the schedules aren’t serializable can be expensive

We can instead use schemes to guarantee that the schedule will be conflict-serializable.

Recoverability

  • Serializability is good for consistency
  • What if transactions fail?
  • in a recoverable schedule, we never commit a transaction that depends on a reads-from if the read-from transaction hasn’t been committed yet
  • cascading rollbacks
  • dirty read: reading a value written by a transaction that hasn’t committed yet
  • cascadeless schedules: transaction that only reads committed values
  • cascadeless implies no cascading rollbacks (which is good!)

### Concurrency Control

  • Approach: guarantee conflict-serialziablity by limiting concurrency with locks
  • Assumptions: ignoring durability (no crashes) though transactions can still abort
  • Goal: serializability and minimal impact of aborts

Lock-based protocols

  • transactions must acquire locks
    • share locks
    • exclusive locks
  • lock requess are made to the concurrency control manager
    • it decides whether to grant a lock request

2-phase Locking Protocol

  • phase 1: growing phase
    • transaction may obtain locks
    • but may not release them
  • phase 2: shrinking phase
    • only release locks
  • what about deadlock??
  • 2PL guarantees conflict-serializability
    • lock-point: the time at which a transcation acquired last lock
    • if lock-point(T1) < lock-point(T2), there can’t be an edge from T2 to T1 in the precedence graph
  • rigorous 2PL: release both exclusive and read locks only at the very end
    • makes serializablity order == the commit order
    • more intuitive behavior for users
  • strict 2PL lock conversion
    • transaction might not be sure what it needs a write lock on in advance
    • start with an S lock
    • upgrade to an X lock later if needed
    • doesn’t change any of the other properties
  • strict and rigorous 2PL ensure consistency but not deadlocks

Dealing with deadlocks

  • deadlock detected, what to do?
    • kill one, usually the youngest in the cycle
  • how to deal with them pre-emptively?
    • wait-die scheme
    • wound-wait scheme
    • both are biased in favor of the older transactions, since they’re assumed to have gotten more work done
    • but, that means the possibility of starvation
      • if a young transaction is aborted too many times, it may be given priority in continuing

Locking granularity

  • coarse vs fine - tradeoffs!

Sharding the Shards

In "Sharding the Shards: Managing Datastore Locality at Scale with Akkio", Annamalai, et al. present Akkio, a locality management service...… Continue reading